EXECUTIVE SUMMARY

High teacher turnover and inadequate teacher preparation represent challenges for education
policymakers. High turnover among teachers in urban school districts can hurt student achievement
by exposing more students to inexperienced teachers (Darling-Hammond 2000); disrupt schools;
and impose a high cost on districts that must recruit, hire, and train replacement teachers (Ingersoll
and Smith 2003; King and Neumann 2000). Even teachers who persist may struggle with pedagogy
or classroom management if they are not adequately supported early in their careers (Kauffman et al.
2002).

To support beginning teachers, most districts offer some form of teacher induction or 
mentoring, but they often provide a limited set of services in response to an unfunded state mandate 
and with modest local resources (Berry et al. 2002; Smith and Ingersoll 2004). We refer to this usual
level of induction support as informal or low-intensity teacher induction, which may include pairing
each new teacher with another full-time teacher without providing training, supplemental materials,
or release time for the induction to occur.

One policy option in response to the problems of high turnover and inadequate preparation is 
to support teachers with a formal, more comprehensive induction program during their initial years 
in the classroom. Support that is intensive, structured, and sequentially delivered is sometimes 
referred to as “comprehensive” induction. It is often delivered through experienced, trained full-time
mentors and may also include a combination of school and district orientation sessions, special
in-service training (professional development), classroom observations, and constructive feedback 
through formative assessment. 

In 2004, the U.S. Department of Education’s Institute of Education Sciences contracted with
Mathematica Policy Research to conduct a large-scale evaluation of comprehensive teacher 
induction. The purpose of the study was to determine whether augmenting the set of services 
districts usually provide to support beginning teachers with a more comprehensive program 
improves teacher and student outcomes. This is the study’s third and final report on the program’s 
impacts. 

To evaluate the impact of comprehensive teacher induction relative to the usual induction 
support, we conducted a randomized experiment in a set of districts that were not already 
implementing comprehensive induction. We assigned 418 elementary schools in 17 urban districts 
by lottery to either (1) a treatment group whose beginning teachers were offered comprehensive
teacher induction or (2) a control group whose beginning teachers received the district's usual, less 
comprehensive or intensive induction services. Random assignment ensures that any systematic 
differences in outcomes between the treatment and control group can be attributed to 
comprehensive induction. 

The comprehensive services were provided by either the Educational Testing Service (ETS) of 
Princeton, New Jersey or the New Teacher Center at the University of California, Santa Cruz 
(NTC), depending on the district’s preference. These program providers implemented their 
respective comprehensive programs in each school and district to which they were assigned. The 
providers began by helping districts select the mentors. Beginning teachers in treatment schools 
were then assigned to a full-time mentor with a 12 to 1 ratio. Mentors received ongoing training and 
a curriculum of materials to support the teachers’ development. Beginning teachers were offered
monthly professional development sessions, opportunities to observe veteran teachers, and an
end-of-year colloquium.

In 10 of the 17 districts, the services were offered to treatment schools for one year only (“one-year
districts”). In the remaining 7 districts, services were offered to treatment schools for two years
(“two-year districts”). Because the two sets of districts implemented different versions of the 
treatment and they were not randomly chosen to implement one or two years of comprehensive 
induction, we present most results separately for one- and two-year districts. 

The research team collected survey and administrative data for four years after the initial 
random assignment in summer 2005. Teacher surveys were used to measure the effect of 
comprehensive induction on support services that teachers reported receiving and the impact it had 
on workforce outcomes (teacher attitudes, teacher retention, and composition of the teacher 
workforce) for the full sample of 1,009 teachers. We conducted classroom observations to measure
the impact on teaching practices in the first year for the subsample of approximately 700 teachers
who were teaching literacy skills. District-provided data on student test scores were used to measure
the impact on test scores for the subsample of approximately 200 teachers in grades and subjects
that had both an end-of-year test and a test of prior achievement from the previous year.

Key findings: 

• During the comprehensive induction program, treatment teachers received more
support than control teachers.¹ For example, in the first year they were more likely to
have a mentor assigned to them (90 versus 72 percent in one-year districts and 96 versus 
79 percent in two-year districts), spent more time with a mentor (85 versus 68 minutes 
per week on average in one-year districts and 108 versus 82 minutes per week on 
average in two-year districts), and participated in more activities such as observing other 
teachers (68 versus 39 percent in one-year districts and 72 versus 47 percent in two-year 
districts), as reported in the spring. The pattern of statistically significant differences 
favoring the treatment group continued in the second year for the districts where 
comprehensive services were offered over two years. However, treatment teachers in
districts receiving one year of comprehensive induction received less support in their 
second year on all these dimensions than control teachers in the same districts. In the 
third and fourth years of teaching, treatment teachers received levels of support that 
were similar to their control group counterparts, whether we considered one-year or 
two-year districts. 


¹ Unless otherwise stated, all comparisons in the executive summary and in the report are statistically significant at the 0.05
level using a two-sided hypothesis test. Statistical significance means that the observed differences are not likely due to
chance.
• The extra induction support for treatment teachers did not translate into impacts
on classroom practices in the first year. We observed teachers giving a literacy lesson
in the spring of their first year and found no impacts on teachers' implementation of the 
literacy lesson, content of the literacy lesson, or classroom culture. 

• For teachers who received one year of comprehensive induction, there was no 
impact on student achievement. In each of the first three years of teachers' careers, 
students of treatment teachers receiving one year of comprehensive induction support 
performed no better on average than students of the corresponding control teachers. 

• For teachers who received two years of comprehensive induction, there was no 
impact on student achievement in the first two years. In the third year, there was 
a positive and statistically significant impact on student achievement. 

- In the third year, in districts and grades in which students’ test scores from the 
current and prior year are available, students of treatment teachers outperformed 
students of the corresponding control teachers on average. These impacts are
equivalent to effect sizes of 0.11 in reading and 0.20 in math, which is enough to
move the average student from the 50th percentile up 4 percentile points in 
reading and 8 percentile points in math. 

- These results are based on the subset of data for which students’ test scores from 
the current and prior year are available. If the analyses are conducted without
requiring test scores from the prior year, we do not find an impact on math or
reading scores. This alternative approach nearly doubles the available sample of
study teachers, but the lack of data on students’ prior achievement results in a less
precise estimate. This means that we are less likely to detect a true impact if it
exists, despite the larger sample size. 

• Neither exposure to one year nor exposure to two years of comprehensive 
induction had a positive impact on retention or other teacher workforce 
outcomes: 

- Treatment teachers did not report being more satisfied or feeling more prepared 
to teach than control teachers at any of the six time points over the four school 
years in which we collected data. 

- There was no impact on teacher retention over the first four years of the 
teachers’ careers. This was true of retention in the original school, the original 
school district, and the teaching profession. 

- We found no evidence that comprehensive induction improved the composition 
of the teacher workforce through selective retention. There were no statistically 
significant positive differences between treatment teachers retained in the district
and control teachers retained in the district in teacher characteristics such as
college entrance exam scores, college selectivity, and advanced degrees; nor in
performance measures such as first year classroom observation scores or third 
year student test scores. 




Study Design 

Participating Districts: 17 school districts in 13 states participated in the study. 
In addition to willingness to participate in the evaluation, districts had to meet criteria for size, 
poverty, and need for induction: at least 570 teachers in elementary schools, at least 10 elementary 
schools with 50 percent or more students eligible for free or reduced-price meals under the federal 
National School Lunch Program, and no existing comprehensive induction program offered in study
schools. To be in the study, districts had to have schools with no full-time mentors and an
expenditure on induction of less than $1,000 per beginning teacher. Although the districts did not
form a statistically representative sample of the nation, they were drawn from states with a variety of 
regulatory, administrative, and demographic contexts. Treatment schools in each district worked with 
one of the two providers of comprehensive induction, ETS or NTC, based primarily on district
preferences. 

Participating Schools: 418 elementary schools participated in the study. Together with 
participating districts, we selected schools for the study that had eligible beginning teachers and that
were not already implementing a comprehensive induction program. 

Participating Teachers: 1,009 teachers participated in the study. Within each study school, all
eligible teachers participated if they were new to the profession, taught in grades K-6, and were not 
already receiving induction support from a teacher preparation or certification program. 

Random Assignment: Within each district, we randomly assigned schools to either the
treatment group, in which case teachers at the school were offered comprehensive teacher induction, 
or to the control group, in which case teachers at the school took part in the district’s usual set of 
induction services. Assigning entire schools helps ensure that teachers in the control group are not 
receiving the benefits of services offered to the treatment group. 
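
The school-level lottery described above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the study's actual procedure: the even split within each district, the coin flip for an odd school, the seed, and the district and school names are all hypothetical.

```python
import random


def assign_schools(schools_by_district, seed=0):
    """Randomly split each district's schools into treatment and control.

    schools_by_district: dict mapping a district name to a list of school IDs.
    Returns a dict mapping each school ID to "treatment" or "control".
    Assumption (not from the report): roughly half of the schools in each
    district are assigned to each group.
    """
    rng = random.Random(seed)  # fixed seed so the lottery can be reproduced
    assignment = {}
    for district in sorted(schools_by_district):
        schools = sorted(schools_by_district[district])  # copy; input is not mutated
        rng.shuffle(schools)
        n_treatment = len(schools) // 2
        # With an odd number of schools, a coin flip decides which group
        # receives the extra school.
        if len(schools) % 2 == 1 and rng.random() < 0.5:
            n_treatment += 1
        for position, school in enumerate(schools):
            assignment[school] = "treatment" if position < n_treatment else "control"
    return assignment


# Hypothetical districts and school IDs, for illustration only.
example = {
    "District A": ["A1", "A2", "A3", "A4"],
    "District B": ["B1", "B2", "B3", "B4", "B5", "B6"],
}
groups = assign_schools(example, seed=42)
```

Because whole schools are assigned within each district, every teacher in a school shares that school's status, and each district contributes schools to both groups.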

Years of Treatment: The treatment included one year of comprehensive induction services for 
10 districts (“one-year districts”) and two years of such services for the other 7 districts (“two-year
districts”). We selected a set of districts to receive a second year of the treatment based on such
factors as whether the mentors were available for a second year. Dividing the sample in this way does
not allow for and should not be used to make direct comparisons between districts that received a
different number of years of treatment. Impacts are presented separately for one-year and two-year
districts. 

Outcomes: We examined impacts on classroom outcomes (evidence of best teaching practices
through classroom observations and effects on student test scores) as well as workforce outcomes,
including teacher satisfaction and preparedness, the rate of teacher retention, and the composition of
the teacher workforce. 


Model-based Approach: To estimate impacts, we used regression methods, which compare the 
treatment and control groups, controlling for the confounding effects of any chance differences in a
range of student, teacher, or school characteristics, such as grade level, students’ eligibility for free or
reduced-price lunch, or teacher certification. The regression model is also used to account for the fact
that teachers or students are clustered within schools.
Data

Teacher Surveys: A teacher survey designed to measure the background of the study 
teachers, receipt of induction services and alternative support services, teacher attitudes, and 
mobility patterns was administered six times over four years. Response rates on teacher surveys 
ranged from 88 to 97 percent for the treatment group and 83 to 94 percent for the control group. 

Classroom Observations: In the first year of the study, trained classroom observers used a 
rubric designed to quantify evidence of best teaching practice in literacy lessons in
639 classrooms. The observations focused on teachers responsible for English or language arts. 

Student Achievement Data: Districts provided student test scores for regularly 
administered reading and math tests as well as associated student background data for each of the 
first three years of the study. For each year, the district provided annual test scores from the 
current year (posttest) and the prior year (pretest). Of the 1,009 teachers who began in the study 
in the 2005-2006 school year, districts provided valid student test score data for teachers in the 
most recent year of the study, the 2007-2008 school year. The other teachers were either no
longer teaching in the district, teaching grades or subjects that were not tested, or teaching in one 
of the two districts that did not provide test score data for the 2007-2008 school year. 


Induction Support for Beginning Teachers 

To select a comprehensive induction program and program provider for the study, we issued a 
Request for Proposals (RFP) in 2004. The RFP specified that the induction program include several 
components that earlier research and professional wisdom gleaned from practice had suggested were 
important features of successful teacher induction programs (Alliance for Excellent Education 2004; 
Ingersoll and Smith 2004; Smith and Ingersoll 2004; Kelly 2004; Serpell and Bozeman 2000). A
group of outside expert reviewers ranked the proposals and selected ETS and NTC as the providers
whose programs most closely met the study’s specified requirements. The two programs were 
roughly comparable in structure and included the required components: 

• Carefully selected and trained full-time mentors 

• A curriculum of intensive and structured support for beginning teachers that includes 
an orientation, professional development opportunities, and weekly meetings with 
mentors 

• A focus on instruction, with opportunities for novice teachers to observe experienced 
teachers 

• Formative assessment tools that permit evaluation of practice on an ongoing basis and 
require observations and constructive feedback 

• Outreach to district and school-based administrators to educate them about program
goals and to garner their systemic support for the program 
Mathematica contracted with both providers to deliver comprehensive induction services to the
districts in the study, with nine of the districts assigned to ETS and the remaining eight to NTC.
Staff from WestEd, a subcontractor for Mathematica, served on the implementation team and were
charged with monitoring the implementation of the comprehensive induction services to help
providers ensure fidelity to the core service model as well as identifying and helping address any 
implementation challenges that arose. 

Both the ETS and NTC programs are based on a curriculum expected to promote effective
teaching. The ETS program defines effective teaching in terms of 22 components organized into
four domains of professional practice. The components are aligned with the Interstate New Teacher
Assessment and Support Consortium (INTASC 1992) principles. The NTC induction model defines
effective teaching in terms of six Professional Teaching Standards. Each standard, or domain, is
broken into a succession of more discretely defined categories of teaching behaviors. 

The curriculum that formed the foundation of both programs included a number of activities. 
Mentors were asked to meet weekly with treatment teachers for approximately two hours. 
Conversation was expected to center around the induction programs’ teacher learning activities, but 
mentors also exercised professional judgment in selecting additional activities to meet beginning 
teachers’ needs, including observing instruction or providing a demonstration lesson; reviewing 
lesson plans, instructional materials, or student work; or interacting with students to gain an 
additional perspective on teachers’ instructional practices. Treatment teachers were provided 
monthly professional development sessions to complement their interactions with mentors, and the
ETS districts also offered monthly study groups: mentor-facilitated peer support meetings in which
beginning teachers discussed their local needs and practices. Treatment teachers also observed
veteran teachers once or twice during the year. At the
end of each school year, treatment teachers in both ETS and NTC districts participated in a
colloquium celebrating the year's successes and teachers' professional growth. 

The providers adapted the curricula of the second year of their usual induction programs for 
the second year of induction services in the two-year districts. While programs provided induction 
activities to these districts' treatment teachers during the second year that were similar to those in 
the first year, the content was designed to reflect the growth of mentors and beginning teachers and 
the evolution of their circumstances and needs. In two-year districts served by ETS, mentors led
Teacher Learning Communities, an adaptation of the first year’s study groups that included specific 
content for each session and a formal structure for teachers to try out approaches to instruction. 
During second year professional development sessions in the two-year districts served by NTC, 
mentors elaborated on standardized topics and designed activities to reflect local needs. 

At the heart of the comprehensive induction services was the support provided by a full-time 
mentor trained by the program providers. The goal of the study was to assign each mentor to 
12 beginning teachers. At the outset of the study, the program providers sought mentor candidates 
with a minimum of five years of teaching experience in elementary school, recognition as an 
exemplary teacher, and experience in providing professional development or mentoring other 
teachers (particularly beginning teachers). During Years 1 and 2, the providers brought their 
respective mentors together for 8-12 days of training spread across 3 to 4 sessions during the 
summer and school year. Trainings previewed the content of upcoming professional development 
sessions and gradually introduced processes of mentor/mentee work in such areas as reflecting on
instructional practices and analyzing student work. All induction activities were voluntary for 
beginning teachers. 
We found that assignment to the treatment group changed the pattern of induction services
reported by beginning teachers. Figures ES.1 and ES.2 summarize the timing and intensity of the
program’s core support, mentoring, measured in minutes per week spent with mentors. Figure
ES.1 presents data for one-year districts and Figure ES.2 presents data for two-year districts. The
figures illustrate statistically significant differences between the treatment and control groups in
weekly time spent with mentors during the intervention. Meeting time falls significantly for both
treatment and control groups after the end of the intervention, and treatment-control differences
become statistically insignificant except for a significant negative difference in one-year districts in
fall 2006. Mentoring is just one measure of induction support. We examined dozens of other
measures and found similar patterns of support over time: 

• In one-year districts, both treatment and control teachers reported receiving 
substantial induction support. However, treatment teachers received more and 
different support than control teachers during the comprehensive induction 
program (their first year of teaching). For instance, relative to control teachers, 
treatment teachers were more likely to have an assigned mentor (90 versus 70 percent in 
fall 2005, p-value 0.000; 90 versus 72 percent in spring 2006, p-value 0.000) and spent
more time per week meeting with their mentors (87 versus 67 minutes in fall 2005,
p-value 0.007; 85 versus 68 minutes in spring 2006, p-value 0.059); these differences
were all statistically significant. 

• In two-year districts, treatment and control teachers reported receiving 
substantial induction support as well. However, similar to the findings in one-year
districts, treatment teachers received more and different support than control
teachers during the comprehensive induction program (their first two years of 
teaching). For instance, relative to control teachers, treatment teachers were more likely 
to have an assigned mentor (between fall 2005 and spring 2007, the percent of treatment
teachers ranged from 80 to 96 and the percent of control teachers ranged from 54 to 79;
p-values all 0.000) and spent more time per week meeting with their mentors (between
fall 2005 and spring 2007, time spent by treatment teachers ranged from 79 to 124
minutes and the time spent by control teachers ranged from 41 to
82 minutes; p-values ranged from 0.001 to 0.087); these differences were statistically
significant with the exception of meeting time in spring 2006. 

• In their second year, immediately following the end of the comprehensive 
induction program, treatment teachers in one-year districts received less and 
different induction support than control teachers. For measures such as the 
percentage of teachers with an assigned mentor and time spent meeting with mentors 
per week, this reflects a significant drop in support among control teachers and an even 
larger significant drop in support among treatment teachers. A survey of teachers in one-year
districts conducted in fall 2006 showed that there were statistically significant
differences favoring the control teachers in several areas: for instance, treatment teachers
were less likely than control teachers to have an assigned mentor (20 percent of
treatment teachers versus 29 percent of control teachers, p-value 0.017) and spent less
time per week meeting with their mentors (19 minutes for treatment teachers versus
59 minutes for control teachers, p-value 0.002). No statistically significant differences
favoring the treatment teachers were found. 
Figure ES.1. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week: One-Year Districts

[Chart not reproduced; horizontal axis: Fall 2005*, Spring 2006*, Fall 2006*, Fall 2007, Fall 2008]

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers.

Note: N = 503 teachers in fall 2005, 499 teachers in spring 2006, 472 teachers in fall 2006, 426 teachers in fall 2007, and 398 teachers in fall 2008.

* Treatment-control difference is significantly different from zero at the 0.05 level.

Figure ES.2. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week: Two-Year Districts

[Chart not reproduced; horizontal axis: Fall 2005*, Spring 2006, Fall 2006*, Spring 2007*, Fall 2007, Fall 2008; legend: Treatment, Control]

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction Activities Survey administered in spring 2007 to study teachers in two-year districts.

Note: N = 395 teachers in fall 2005, 386 teachers in spring 2006, 360 teachers in fall 2006, 372 teachers in spring 2007, 326 teachers in fall 2007, and 321 teachers in fall 2008.

* Treatment-control difference is significantly different from zero at the 0.05 level.
• In the third and fourth years of teaching, after the intervention ended for all
districts, treatment and control teachers received similar levels of support. In
both one- and two-year districts, there were statistically significant differences in fewer
than 7 percent of the 134 measures we surveyed.

Impacts on the Classroom 

To measure the effect of comprehensive induction in the classroom, we compared these
outcomes for teachers in the treatment and control groups: (1) the use of best practices in teaching a
literacy lesson and (2) standardized test scores for teachers' students. For the classroom practices
analysis, we focused on those teachers responsible for English language arts or literacy classes
(698 teachers). For test score analyses, we focused on teachers in tested grades and subjects (about 
200 teachers per year). Results pertaining to literacy instruction do not necessarily apply to teachers 
of other subjects. Similarly, results for teachers in tested grades do not necessarily apply to teachers 
of other grades or subjects. 

Classroom Practices. We sent trained observers into treatment and control classrooms to 
administer the Diagnostic Classroom Observation (DCO) in spring 2006 (year 1 of the study). We 
observed literacy (or reading/language arts) lessons in 639 classrooms. Based on a set of
16 indicators, observers scored teachers on a five-point scale, ranging from “no evidence” to
“extensive evidence” of effective teaching practice. We produced summary scores by averaging the
indicators within each of three domains. 

After controlling for teacher and school characteristics, we observed no statistically significant 
differences between treatment and control teachers’ performance on the three domains measured by 
the DCO: implementation of a literacy lesson, content of a literacy lesson, or classroom culture. 

Student Achievement. To measure impacts on student achievement we compared the test 
scores for students of treatment teachers at the end of the year (posttest) to those of control 
teachers, accounting for any preexisting differences in prior achievement (pretest) and background 
characteristics of students and teachers. We present results in this report for the teachers’ third year,
the 2007-2008 school year. This was the second year after the treatment ended in one-year districts,
and the first year after the treatment ended in two-year districts. 

For one-year districts, the impacts on math and reading scores in the study's third year were not
significantly different from zero. For two-year districts, the impacts on math and reading scores were
both positive and statistically significant. The results for the two-year districts, presented in Figure
ES.3, show that comprehensive induction led to an increase in test scores of 11 percent of a
standard deviation in reading, which is enough to move the average student from the 50th percentile 
up 4 percentile points, and an increase of 20 percent of a standard deviation in math scores, enough 
to move the average student up 8 percentile points. 
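
The conversion from effect sizes to percentile points can be checked with a short calculation. The Python sketch below assumes that test scores are approximately normally distributed, so that a student at the 50th percentile who gains a given number of standard deviations moves to the percentile given by the standard normal cumulative distribution function. The normality assumption and the function names are ours for illustration; the summary does not state how the conversion was made.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_gain(effect_size):
    """Percentile-point gain for a student who starts at the 50th percentile.

    Assumes normally distributed scores (an illustrative assumption).
    """
    return 100.0 * normal_cdf(effect_size) - 50.0


reading_gain = percentile_gain(0.11)  # reading effect size reported above
math_gain = percentile_gain(0.20)     # math effect size reported above
print(round(reading_gain), round(math_gain))  # prints: 4 8
```

Under this assumption the calculation reproduces the reported gains of about 4 percentile points in reading and 8 in math.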


1 Results for the first year showed no overall impacts in either math or reading, as documented in an earlier report 
(Glazerman et al. 2008). In the second year, when we looked separately at one- and two-year districts, we continued to
find no overall impacts in either subject (Isenberg et al. 2009).



As specified in the study design (Glazerman et al. 2005), the eligible sample for the test score
analysis was limited to teachers in tested grades and subjects. Because our design also aimed to
account for preexisting differences in student achievement, we included only students who took
both a pretest and posttest, thereby excluding the lowest tested grade. For example, in many
districts, the lowest tested grade was grade 3. If we expand the sample of teachers and grades by not
requiring test scores from the prior year (approximately doubling the number of teachers included in
the analyses), we do not find an impact on math or reading scores in either one-year or two-year
districts. However, the lack of data on students’ prior achievement produces a less precise estimate.
This means that with the expanded sample we are less likely to detect a true impact if it exists,
despite the larger sample size. 

Figure ES.3. Impacts on Test Scores, Year 3 (grades with current and prior year tests)

[Chart not reproduced; vertical axis: effect size, from -0.15 to 0.25]

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by participating school districts; Mathematica Teacher Background Survey administered in fall 2005 to all study teachers.

Note: Data are regression adjusted and account for clustering of students within schools. N = 99 teachers and 1,690 students in reading and 95 teachers and 1,629 students in math in one-year districts; 74 teachers and 1,347 students in reading and 68 teachers and 1,198 students in math in two-year districts.

* Treatment-control difference is significantly different from zero at the 0.05 level.





Impacts on the Teaching Workforce 

To measure the effect of comprehensive induction on the teacher workforce, we examined the
impacts of comprehensive induction on (1) teachers’ attitudes that relate to career decisions,
including their satisfaction with teaching and their feelings of preparedness to deal with different 
aspects of their jobs; (2) teachers' mobility; and (3) the mix of teachers who decide to stay in the 
district. 

Teacher Attitudes. Using items from the induction activities surveys, we measured teachers’
feelings of satisfaction in 19 areas on a four-point scale ranging from “very dissatisfied” to “very
satisfied” and teachers' feelings of preparedness in 13 areas on a four-point scale from “not at all
prepared” to “very well prepared.” Factor analysis suggested that teacher satisfaction and teacher
preparedness could be grouped into three categories each: satisfaction with (1) school, (2) class, and 
(3) career; and preparedness to (1) instruct, (2) work with students, and (3) work with others. 

Comprehensive induction did not make teachers feel more satisfied or prepared. Teachers from 
the treatment and control groups reported feelings of satisfaction and preparedness that differed by 
0.1 or less on the four-point scales at all points in time at which we measured these attitudes. These 
results are robust to alternate ways of aggregating the data, including rescaling the responses as
binary variables and considering each original area of satisfaction or preparedness individually rather 
than combined into larger categories. 

Teacher Mobility. Comprehensive induction did not make beginning teachers more likely to 
stay in their schools, their districts, or the profession. To measure teacher mobility, we surveyed 
teachers annually to learn whether they were still teaching, and if so, where they were teaching. By 
the end of the study period, 69 percent of teachers in one-year districts and 63 percent of teachers in 
two-year districts were still teaching in their original district. Figures ES.4 and ES.5 illustrate the lack
of statistically significant treatment-control differences. They show survival curves that plot
the percentage of teachers retained in their original district in each year of the study for the one-year
districts (Figure ES.4) and two-year districts (Figure ES.5), separately by treatment status.

The treatment group is represented by a solid line and the control group by a dashed line. The
differences were not statistically significant at any time point in either one-year or two-year
districts. When we conducted similar analyses of retention in the profession and the original
school, we also found no significant treatment-control differences. For example, in terms of
retention in the profession, 87 percent of teachers in one-year districts and 85 percent of teachers
in two-year districts returned for a fourth year of teaching; in neither case was the
treatment-control difference statistically significant.
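
The survival curves described above reduce to a simple calculation: for each study year and each experimental group, the percentage of teachers still teaching in their original district. The following is a minimal sketch of that calculation in Python. The function name `retention_curve`, the record layout, and the example data are hypothetical and are not drawn from the study; the study's own estimates also involved survey weights and significance tests that this sketch omits.

```python
from collections import defaultdict


def retention_curve(records, years):
    """Return {group: [percent retained in each year]}.

    records: iterable of (group, last_year) pairs, where last_year is the
    last study year in which the teacher still taught in the original
    district (a hypothetical layout chosen for this illustration).
    """
    totals = defaultdict(int)
    retained = defaultdict(lambda: [0] * len(years))
    for group, last_year in records:
        totals[group] += 1
        for i, year in enumerate(years):
            if last_year >= year:
                retained[group][i] += 1
    return {
        group: [100.0 * n / totals[group] for n in counts]
        for group, counts in retained.items()
    }


# Made-up records for illustration only; these are not study data.
records = [
    ("treatment", 4), ("treatment", 4), ("treatment", 2), ("treatment", 1),
    ("control", 4), ("control", 3), ("control", 3), ("control", 1),
]
curve = retention_curve(records, years=[1, 2, 3, 4])
for group, percents in sorted(curve.items()):
    print(group, percents)
```

Plotting each group's list against the study year would give one line per group, in the spirit of Figures ES.4 and ES.5.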

When we examined in more detail where the movers and leavers went, we still did not find 
significant differences between treatment and control group mobility patterns in one-year or
two-year districts. For example, statistically similar percentages of treatment and control teachers (about
47 percent) stayed in the same school in one-year districts, whereas 15 percent moved within the 
district, 10 percent moved to a new district (including charter schools), and 5 percent moved to 
private schools. We examined the reasons teachers gave for moving to a teaching position outside 
their original school or for leaving the profession and again found no significant treatment-control 
differences. 
Figure ES.4. Survival Curve for One-Year Districts: Percentage Remaining in the District

Source: Mathematica First, Second, and Third Teacher Mobility Surveys administered in fall 2006, fall 2007, and
fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. N = 561 teachers in fall 2005,
500 teachers in fall 2006, 476 teachers in fall 2007, and 417 teachers in fall 2008.

Treatment-control differences are not significantly different from zero at the 0.05 level.

Figure ES.5. Survival Curve for Two-Year Districts: Percentage Remaining in the District

Source: Mathematica First, Second, and Third Teacher Mobility Surveys administered in fall 2006, fall 2007, and
fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. N = 448 teachers in fall 2005,
382 teachers in fall 2006, 364 teachers in fall 2007, and 345 teachers in fall 2008.

Treatment-control differences are not significantly different from zero at the 0.05 level.

We conducted sensitivity analyses to confirm the findings of no impacts on retention rates. To
address concerns about potential bias introduced by which teachers respond to the surveys, we used
a variety of alternative methods for defining mobility, defining the eligible sample, and estimating
the impacts, and continued to find no impact on mobility for all plausible assumptions regarding
survey nonrespondents.

Composition of the Workforce. We investigated the impacts of comprehensive induction on
the composition of the teaching force in the district to understand whether comprehensive
induction raised the quality of teaching by encouraging the weakest teachers to leave or lowered it by
discouraging the strongest ones from staying. To test this hypothesis, we used measures of teachers'
professional qualifications, classroom practice ratings from their first year, and student test data
from the third year of teaching.

We found similar levels of professional qualifications for treatment and control teachers who
remained in their original district in their fourth year of teaching (“stayers”). Restricting the sample
to stayers, we compared the average values of several teacher characteristics for treatment teachers
to control teachers. The top panel of Table ES.1 shows results for one-year districts; Table ES.2
shows results for two-year districts. For each teacher characteristic, we found no statistically
significant difference between treatment stayers and control stayers.

Similarly, when we measured teacher quality using classroom performance measures,
comprehensive induction did not improve the composition of the teacher workforce. The bottom
panels of Tables ES.1 and ES.2 focus on performance measures. Among stayers, treatment
teachers did not exhibit stronger evidence than control teachers of effective
classroom practices during the first year of the study. As shown in the bottom panel of Table ES.1,
which refers to one-year districts, stayers in the control group outperformed stayers in the treatment
group in raising students' math test scores by a statistically significant margin in year 3. There was no
significant difference in reading test scores. Results for two-year districts are shown in Table ES.2.
Unlike the full group of teachers included in the analysis in the third year of the study, among stayers
returning for a fourth year in their original district there was no significant difference between
treatment and control teachers in reading or math.

Association Between Levels of Induction Support and Outcomes 

To complement the experimental analysis, which was based on the random assignment of
teachers to treatment and control groups, we conducted two correlational analyses. Correlational
analyses should be interpreted with caution because they are not causal. Unlike with the randomized
experiment that used variation in treatment status, the variation in induction services that we explore
here can be caused by confounding factors that also explain teachers' attitudes, workforce
attachment, and effectiveness in the classroom.

The first correlational analysis tests whether there is a relationship between the study outcomes
and the level or intensity of induction services more generally. We exploit the natural variation in
induction support that occurred across teachers within and between experimental groups to
determine whether there is a relationship between the level of induction support and the study
outcomes. We use the same sample (treatment and control) and the same regression methods as the
experimental analyses, but instead of assignment to treatment status as the key explanatory variable,
we use four measures of induction support based on the number of years the teacher had an
assigned mentor and indices of the breadth, intensity, and instructional focus of induction services
constructed from the survey data on induction activities.
Table ES.1. Characteristics of District Stayers After Three Years, by Treatment Status (Percentages Except
Where Noted): One-Year Districts

                                                                    Treatment   Control
Teacher Characteristic                                              Stayers     Stayers     Difference   P-value

Background Characteristic
  College entrance exam score (SAT combined score or equivalent)    1040        1013        27           0.325
  Attended highly selective college                                 27.5        27.2        0.3          0.954
  Major or minor in education                                       78.7        80.9        -2.1         0.665
  Student teaching experience (weeks)                               15.8        15.4        0.4          0.772
  Highest degree is master's or doctorate                           22.4        28.2        -5.8         0.311
  Entered the profession through traditional four-year program      67.6        58.9        8.7          0.171
  Certified (regular or probationary)                               94.7        94.7        0.0          0.999
  Career changer                                                    14.5        12.8        1.7          0.682
  Sample Size (Teachers)                                            148         139
  Sample Size (Schools)                                             88          84

Year 1 Classroom Observation Score (on 1 to 5 scale)
  Content of a literacy lesson                                      2.3         2.6         -0.3*        0.024
  Implementation of a literacy lesson                               2.6         2.8         -0.2         0.151
  Classroom culture                                                 3.0         3.1         -0.1         0.607
  Sample Size (Teachers)                                            100         94
  Sample Size (Schools)                                             71          65

Year 3 Test Scores (standard deviation units)
  Reading                                                           -0.27       -0.28       0.02         0.764
  Math                                                              -0.26       -0.11       -0.15*       0.008
  Sample Size (Teachers)                                            25          36
  Sample Size (Schools)                                             23          29

Source: Mathematica analysis using data from the College Board and ACT, Inc.; data from the 2006-2007 and
2007-2008 school years provided by participating school districts; Mathematica Third Teacher Mobility
Survey administered in fall 2008 to all study teachers; Mathematica classroom observations conducted
in spring 2006.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted to account for
the study design. The analysis of college entrance exam scores relied on a smaller sample
(84 treatment and 86 control teachers and 61 treatment and 62 control schools). The analysis of Year 3
Test Scores relied on a different sample for reading (26 treatment and 34 control teachers and
24 treatment and 27 control schools) and math (per table values).

*Significantly different from zero at the 0.05 level.

Table ES.2. Characteristics of District Stayers After Three Years, by Treatment Status (Percentages Except
Where Noted): Two-Year Districts

                                                                    Treatment   Control
Teacher Characteristic                                              Stayers     Stayers     Difference   P-value

Background Characteristic
  College entrance exam scores (SAT combined score or equivalent)   905         935         -30          0.330
  Attended highly selective college                                 23.7        21.4        2.3          0.703
  Major or minor in education                                       67.8        66.6        1.2          0.874
  Student teaching experience (weeks)                               12.3        12.3        0.1          0.975
  Highest degree is master's or doctorate                           16.7        10.2        6.5          0.196
  Entered the profession through traditional four-year program      61.1        66.4        -5.4         0.443
  Certified (regular or probationary)                               95.8        92.9        2.9          0.366
  Career changer                                                    17.1        11.7        5.4          0.292
  Sample Size (Teachers)                                            124         93
  Sample Size (Schools)                                             67          52

Year 1 Classroom Observation Score (on 1 to 5 scale)
  Content of a literacy lesson                                      2.4         2.4         0.0          0.690
  Implementation of a literacy lesson                               2.7         2.6         0.1          0.583
  Classroom culture                                                 3.1         3.1         0.1          0.624
  Sample Size (Teachers)                                            87          62
  Sample Size (Schools)                                             50          41

Year 3 Test Scores (standard deviation units)
  Reading                                                           -0.23       -0.27       0.05         0.302
  Math                                                              -0.12       -0.24       0.11         0.054
  Sample Size (Teachers)                                            31          16
  Sample Size (Schools)                                             21          14

Source: Mathematica analysis using data from the College Board and ACT, Inc.; data from the 2006-2007 and
2007-2008 school years provided by participating school districts; Mathematica Third Teacher Mobility
Survey administered in fall 2008 to all study teachers; Mathematica classroom observations conducted
in spring 2006.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted to account for
the study design. The analysis of college entrance exam scores relied on a smaller sample
(56 treatment and 47 control teachers and 40 treatment and 35 control schools). The analysis of Year 3
Test Scores relied on a different sample for reading (33 treatment and 17 control teachers and
24 treatment and 15 control schools) and math (per table values).

None of the differences is statistically significant at the 0.05 level.

The relationships between induction support and student achievement were mixed. For math,
some associations were positive and statistically significant, and others were statistically
insignificant. For reading, all associations were statistically insignificant.

Beginning teachers who received more induction support reported being more satisfied, on
average, than those who received less. Induction intensity and instructional focus stood out as the
two aspects of support that were positively related to teacher attitudes. The relationship of induction
services to teachers' reported feelings of preparedness exhibited a similar pattern but with only one
statistically significant relationship (induction intensity). These feelings of satisfaction and
preparedness did not translate into better retention. None of the four measures of beginning teacher
support was related to retention in the district or in the profession.

In the second correlational analysis, we examined whether better outcomes are associated with
matching between the mentor and mentee on two dimensions, race/ethnicity and grade. We
conducted this analysis using only the treatment group, which is the part of the sample for which we
have detailed information on mentor background.

Beginning teachers who had the same race/ethnicity as their mentor or taught the same grade
their mentor had taught had lower rates of retention in the district and in the profession than those who
did not have such a match. This contradicts the hypothesis that better matching would produce
better outcomes. When we examined the other two outcomes, teacher attitudes and student
achievement, we found no evidence of a statistically significant relationship with a mentor match.

I. INTRODUCTION AND BACKGROUND


Policymakers and researchers have recently been concerned about shortages of highly qualified
teachers in hard-to-staff school districts (Howard 2003; Ng 2003), particularly in urban areas
(Murphy et al. 2003). These concerns have generated debate about how to attract new teachers
(Levin and Quinn 2003), although some researchers have argued that the shortages may have less to
do with the difficulties of attracting new teachers than with retaining them (Ingersoll 2001). A
frequently cited statistic from national data on teacher mobility suggests that 24 percent of beginning
teachers leave the classroom by the end of their second year and 46 percent leave by the end of their
fifth year (Ingersoll 2003).


High teacher turnover can have negative consequences. It can hurt student achievement by
exposing more students to inexperienced teachers (Darling-Hammond 2000). It can also impose a
high cost on districts that must recruit, hire, and train replacement teachers, and it can disrupt
schools (Ingersoll and Smith 2003; King and Neumann 2000; Alliance for Excellent Education
2004).


Even teachers who manage to persist can find themselves struggling if they are not adequately
supported early in their careers, especially if they were not adequately prepared for the challenges of
the classroom. The hardest-to-staff schools tend to have classroom conditions that challenge even
the best-trained teacher candidates. Teachers who start their careers in these settings may face
challenges in pedagogy or classroom management for which they were not fully prepared (Kauffman
et al. 2002).


One policy option in response to the problems of high turnover and inadequate preparation is
to support teachers with a formal, comprehensive induction program during their initial years in the
classroom. Such a program might include a combination of school and district orientation sessions,
special in-service training (professional development), mentoring by an experienced teacher,
classroom observation, and constructive feedback through formative assessment. To support
beginning teachers, most districts offer some form of teacher induction or mentoring, but they often
provide a limited set of services in response to an unfunded state mandate and with modest local
resources (Berry et al. 2002; Smith and Ingersoll 2004). An example of informal or low-intensity
teacher induction is pairing each new teacher with another full-time teacher without providing
any training, supplemental materials, or release time for the induction to occur. In short, although
teacher induction is common, induction that is intensive, structured, and sequentially delivered in
response to teachers’ emerging pedagogical needs is not common. Throughout this report, we refer
to the more formal, structured programs as “comprehensive” induction.


One reason that school districts do not offer more support to new teachers could be that
comprehensive teacher induction is expensive. Costs for induction programs, as estimated in recent
literature, range from $1,660 to $6,605 per teacher per year (Villar and Strong 2007; Alliance for
Excellent Education 2004).¹ Moreover, there is little empirical evidence on whether investing
additional resources in a more comprehensive, and hence more expensive, induction program would
help districts attract, develop, and retain beginning teachers.

¹ These reports note costs for five programs: four are two-year programs and one is a one-year program. The data
sources include state, district, county, and local data. The period to which the data pertain is 2003-2004 for three
programs and unspecified for the other two. Several other studies of the costs of teacher turnover present estimates of
induction or teacher training costs, but these measures are expressed in terms of costs per vacancy. Without additional
information on the number of vacancies, this measure does not provide sufficient information to be helpful to districts
considering the adoption of an induction program. See National Commission on Teaching and America’s Future (2007),
Barnes et al. (2007), Milanowski and Odden (2007), and Fuller (2000).

According to several research reviews (Ingersoll and Kralik 2004; Totterdell et al. 2004; Lopez
et al. 2004; Borman and Dowling 2008), studies of teacher induction to date have been neither
conclusive nor rigorous. Research based on federal statistics (for example, Shen 1997; Smith and
Ingersoll 2004; Henke et al. 2000; Alt and Henke 2007) can provide a useful, nationally
representative perspective on the issue, but it is limited in the extent to which it can capture the
intensity of induction supports and in the range of outcomes that can be examined. Research at the
local level (for example, Fuller 2003; Youngs 2002; Rockoff 2008; Youngs 2007; Wechsler et al.
2010) has yielded more detailed descriptions of teacher supports but, like the national studies, has
relied on non-experimental approaches that do not necessarily provide unbiased estimates of the
causal impacts of interest: the retention rate for participants or test scores of participants’ students
compared to what they would have been in the absence of the program. Researchers in this tradition
have attempted to address selection bias to varying degrees. Some researchers have reported
retention rates for program participants absent a comparison group or have simply referred to the
overall state retention rate as a benchmark (Odell and Ferraro 1992; Tushnet et al. 2002).

Congressional interest in formal teacher induction has grown, despite the lack of evidence. The
No Child Left Behind Act of 2001 (NCLB), which reauthorized the Elementary and Secondary
Education Act of 1965 (ESEA), emphasizes the importance of teacher quality in student
improvement. Title II, Part A, of ESEA (the Improving Teacher Quality State Grants program)
provides nearly $3 billion per year to states to train, recruit, and prepare high-quality teachers. The
implementation of teacher induction programs is one allowable use of these funds. Current
discussions on the reauthorization of NCLB argue for a continued focus on supporting teachers
through professional development opportunities and teacher-mentoring programs, with a call to
fund “proven models” to meet these objectives. In addition, the Higher Education Opportunity Act
of 2008 authorizes grants that include teacher induction or mentoring programs for new teachers.
These initiatives demonstrate federal interest in a policy response grounded in providing induction
support as a core means to improve teacher quality. They also, however, stress the need to conduct
rigorous research to determine whether efforts to implement comprehensive teacher induction
programs produce a measurable impact on teacher retention and other positive outcomes for
teachers and students.

A. Research Questions and Study Design Overview

To provide Congress and state and local education agencies with the scientific evidence that will
support sound decisions about teacher induction, the National Center for Education Evaluation and
Regional Assistance within the U.S. Department of Education’s (ED) Institute of Education
Sciences (IES) contracted with Mathematica Policy Research to conduct the Evaluation of the
Impact of Teacher Induction Programs. The study examines whether augmenting the set of services
districts usually provide to support beginning teachers with a more comprehensive program
improves teacher retention rates and other positive teacher and student outcomes. More specifically,
the analysis is designed to address the following research questions:

1. What is the effect of comprehensive teacher induction on the types and intensity of
induction services teachers receive, relative to the types and intensity of services they
receive from districts' current induction programs?

2. What impacts does comprehensive induction have in the classroom? Specifically, what
are the impacts on:

a. teachers’ classroom practices?

b. student achievement?

3. What impacts does comprehensive induction have on the teaching workforce?
Specifically, what are the impacts on:

a. teacher attitudes (satisfaction and feelings of preparedness)? 

b. teacher retention, including retention in the district and in the profession? 

c. composition of the teaching force? 

As part of this study, we issued a request for proposals in 2004 to identify a promising
comprehensive teacher induction program. Among the proposals received in response to our
request, two described highly similar programs operated by different providers; each program earned
the highest rating from an expert review committee. The providers were Educational Testing Service
(ETS), Princeton, New Jersey, and the New Teacher Center (NTC) at the University of California,
Santa Cruz. Mathematica contracted with both providers to deliver one year of the services that we
characterize as comprehensive. Of the 17 districts participating in the study, ETS operated in
9 districts, and NTC operated in 8 districts.

The study used an experimental design in which we randomly assigned a selected group of
elementary schools within each of the 17 participating districts either to a treatment group, which
received comprehensive teacher induction from either ETS or NTC (depending on the district), or to
a control group, which took part in the district’s usual teacher induction program if one existed. We
assigned 418 elementary schools with 1,009 eligible beginning teachers across the 17 urban districts
with at least ten high-poverty elementary schools. Although the districts selected for the study did
not form a statistically representative sample of the nation, they were drawn from 13 states with a
variety of regulatory, administrative, and demographic contexts. The study includes elementary
schools only.

After the first year of intervention services was delivered in treatment schools, IES decided to
expand the treatment to include a second year of services for a subsample of the districts, in effect
creating two studies: one for districts that received one year of services, and the other for districts
that received two years. The teachers assigned to treatment in the one-year districts started in fall
2005 and received induction services in the 2005-2006 school year; the teachers assigned to
treatment in two-year districts also started in fall 2005 but received services in the 2005-2006 and
2006-2007 school years.

We selected the districts to receive a second year of the treatment based on factors such as
whether the mentors who had been trained within the district by ETS or NTC were available for a
second year and whether the group of districts selected for a second year would include
approximately one-half of the total number of teachers participating in the evaluation. Dividing the
sample in this way does not allow for direct comparisons between the districts that received one
year of treatment and those that received two years of treatment; instead, it allows us to
investigate the effectiveness of one-year programs separately from that of two-year programs.
Seven districts (four for ETS and three for NTC) continued the program to a second year.

This report presents findings from all three years of the study but emphasizes findings from the
third year of the study. Except where noted, we present findings separately for the set of 10 districts
that received one year of treatment and the other set of 7 districts that received two years of
treatment.

Researchers from WestEd, a subcontractor to Mathematica, monitored the implementation of
the comprehensive induction services. WestEd staff played a critical role by providing regular,
on-site oversight of the implementation to help ensure that it was faithful to the core service model and
to identify and help address any implementation challenges that arose.

B. Previous Findings from the Study 

Two interim reports from this study (Glazerman et al. 2008; Isenberg et al. 2009) showed that
teachers assigned to the treatment group reported more induction support than control teachers
while treatment services were being offered, but they also showed that the additional support did
not translate into positive impacts on key outcomes after either of the first two years of the study.²
The first year report showed that the offer of comprehensive induction support amounted to a
greater likelihood of having a mentor formally assigned to beginning teachers (93 versus 73 percent)
in the teacher’s first year, more time spent in meetings with the mentor (93 versus 74 minutes per
week), and greater frequency of receiving assistance in all 10 induction activities asked about for the
week preceding the spring 2006 survey (such as suggestions to improve practice and help with state
and district standards) and in all 22 areas asked about for the three months preceding the spring
survey (including classroom management, reviewing student work, and communicating with
parents).

² All comparisons discussed in this report are statistically significant at the 0.05 level unless otherwise stated.

The first year report found no positive impacts on classroom practices, student achievement,
teacher retention, or the composition of the district’s teaching workforce. Nor did the first interim
report find any evidence of positive impacts on teachers’ satisfaction or feelings of preparedness
(Glazerman et al. 2008). The followup after the second year continued to find treatment-control
contrasts in induction supports favoring the treatment group in two-year districts, but revealed no
impacts on student achievement, teacher attitudes, or teacher mobility in either one-year or two-year
districts (Isenberg et al. 2009).

The current report summarizes all of the findings for the study through the final followup,
which took place after the study teachers’ third year.

C. Conceptual Background for the Study

To answer the research questions, we began by identifying the pathways through which teacher
induction programs could lead to teacher and student outcomes. Figure 1.1 illustrates these pathways
and highlights some of the contextual factors that are useful to consider when planning and
interpreting these analyses. More specifically, the figure shows how induction program components,
contextual factors, and other mediating factors might affect teacher and student outcomes.


Figure 1.1. Effects of Teacher Induction on Teacher and Student Outcomes: Conceptual Framework

[Figure: conceptual diagram with panels labeled Intervention, Mediating Factors, and Final Outcomes.]


Context. Context is important. The structure and functioning of an induction program are
likely to be influenced by the characteristics of the local area, the school, the beginning teacher’s
classroom, the teacher, and her students (Figure 1.1, Box A). Teacher and student outcomes may be
directly affected, for example, by neighborhood demographics, the degree of administrative and
financial support for beginning teachers, the percentage of a classroom’s students with special needs
or special education status, and teachers’ employment histories. For this reason, the study examines
variation within each of 17 districts by using district as a stratum within which random assignment is
conducted (see Chapter II).
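
Using district as a stratum means that schools are randomized separately within each district, so treatment and control schools share the same district-level context by design. The sketch below illustrates the general idea in Python. It is an illustration under stated assumptions, not the study's procedure (which is described in Chapter II): the function name `assign_within_strata`, the even split within each district, and the school identifiers are all hypothetical.

```python
import random


def assign_within_strata(schools_by_district, seed=0):
    """Randomly split each district's schools into treatment and control.

    Randomization happens separately inside every district (the stratum),
    so each district contributes schools to both groups. The even split
    is an assumption made for this illustration.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    assignment = {}
    for district, schools in sorted(schools_by_district.items()):
        shuffled = list(schools)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for i, school in enumerate(shuffled):
            group = "treatment" if i < half else "control"
            assignment[school] = (district, group)
    return assignment


# Hypothetical districts and schools, for illustration only.
schools = {
    "District A": ["a1", "a2", "a3", "a4"],
    "District B": ["b1", "b2", "b3"],
}
assignment = assign_within_strata(schools, seed=42)
for school, (district, group) in sorted(assignment.items()):
    print(district, school, group)
```

Because the shuffle is done district by district, any comparison of treatment and control schools can be made within districts, which is the sense in which the design "examines variation within each of 17 districts."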

Induction Program. Induction programs may include a variety of possible components
(Figure 1.1, Box B). There is no one-size-fits-all model of teacher induction: different programs
emphasize different approaches. For instance, programs may stress to a greater or lesser degree
components such as orientation, assessment, professional development workshops, mentoring/peer
coaching, small-group activities, and classroom observation. Presumably, the more intense the
emphasis on a given component, the larger the effect it will have on outcomes. But even the
intensity with which a program implements a given component may vary in terms of quality,
duration, and frequency. In this study, teachers in the treatment group received a specially selected
comprehensive program of induction supports.

Counterfactual Condition. This study does not compare comprehensive induction to no
induction or support for beginning teachers. Rather, it addresses the policy-relevant question: What
would happen if school districts that were not already implementing comprehensive programs were
to begin doing so? The school districts in the study that represent this state of the world (the
counterfactual condition) were carefully selected to be the ones that might consider adopting
comprehensive programs in the future. They may already have an existing set of informal
arrangements for supporting beginning teachers, but the expectation is that a new, comprehensive
program would expand these supports and possibly change the means by which support is provided
and the focus of that support. Thus in Figure 1.1 we hypothesize that the breadth, intensity, and
nature of induction services (Box B) will differ on average for treatment and control classrooms.

Outcomes. Induction may benefit school districts in two ways: by strengthening the teacher 
workforce through reducing attrition and/or improving the composition of the workforce (Figure 
I.1, Box F) and by enabling teachers to improve student academic outcomes (Figure I.1, Box F). 
Induction may affect mediating factors that help explain changes in these final outcomes. For 
instance, two possible precursors to teacher mobility are dissatisfaction and the feeling of being 
unprepared, both of which can presumably be mitigated with more intensive induction support 
(Figure I.1, Box C). In addition, students’ academic outcomes may improve through the mediating 
factor of improved classroom practices (Figure I.1, Box D). 

D. Organization and Content of This Report 

The conceptual framework presented above guided the organization of this report. After 
presenting the methods (Chapter II) and data (Chapter III), the report outlines the induction 
program components under study and the services that treatment and control teachers report 
receiving, in Chapter IV. Next, we present estimates of the effect of the treatment by examining 
impacts on classroom practices and student test scores (Chapter V) and effects on the teaching 
workforce by examining impacts on teacher attitudes and mobility behavior (Chapter VI). The final 
chapter presents correlational analyses that relate measures of different aspects of induction to key 
outcomes (Chapter VII) as a way to add context to the experimental findings. 

II. STUDY DESIGN AND METHODS 

The centerpiece of the design for the teacher induction evaluation is the use of random 
assignment to construct a group of teachers who were exposed to comprehensive teacher induction 
services (treatment) and an equivalent group who were exposed to the induction services normally 
offered by the districts (control). This chapter documents the study design, discusses the methods 
for selecting districts, schools, and teachers for inclusion in the study, and describes the data analysis 
methods. 


The sample selection process described in this chapter is summarized in Figure II.1. Although 
we undertook a purposeful selection of districts and schools, the schools, once selected to be in the 
study, were then randomly assigned within each district to a treatment or control group. This 
ensures that the resulting impact estimates are internally valid. The description of the district and 
school selection process below is meant to help readers understand the population to which the 
findings generalize. 


A. Selection of Districts 


We sought a group of approximately 16 to 20 school districts that were not already providing 
comprehensive teacher induction in all the schools that needed it, but would be candidates for future 
adoption of such a program. The initial list of targeted districts was selected according to size and 
poverty levels in order to guarantee a sufficiently large sample for statistical precision while including 
hard-to-staff schools. We first used data from the National Center for Education Statistics’ 
Common Core of Data (CCD) 2004-2005 to identify all school districts in the United States with at 
least 570 teachers in elementary schools and at least 10 elementary schools with 50 percent of 
students eligible for free or reduced-price meals under the federal government’s National School 
Lunch Program (NSLP). We developed these size and poverty targets in consultation with the 
Institute of Education Sciences (IES), based on an earlier feasibility analysis (Glazerman et al. 2005). 
Nationally, 98 districts were determined to meet these targets. 
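
The size and poverty screen described above can be illustrated with a minimal sketch. The thresholds come from the text; the record layout, field names, and example districts are hypothetical and do not reflect the CCD's actual schema, and the sketch assumes that "50 percent of students eligible" means at least 50 percent.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the text above.
MIN_ELEMENTARY_TEACHERS = 570
MIN_HIGH_POVERTY_SCHOOLS = 10
POVERTY_SHARE = 0.50  # assumed to mean "at least 50 percent" NSLP-eligible


@dataclass
class District:
    # Hypothetical record layout; the CCD's real field names differ.
    name: str
    elementary_teachers: int
    school_nslp_shares: List[float]  # one NSLP-eligible share per elementary school


def meets_targets(d: District) -> bool:
    """Return True if the district passes both the size and poverty screens."""
    high_poverty = sum(1 for s in d.school_nslp_shares if s >= POVERTY_SHARE)
    return (d.elementary_teachers >= MIN_ELEMENTARY_TEACHERS
            and high_poverty >= MIN_HIGH_POVERTY_SCHOOLS)


# Made-up districts for illustration only.
examples = [
    District("A", 900, [0.8] * 12 + [0.3] * 5),   # passes both screens
    District("B", 400, [0.9] * 20),               # too few teachers
    District("C", 1200, [0.6] * 4 + [0.2] * 30),  # too few high-poverty schools
]
eligible = [d.name for d in examples if meets_targets(d)]
print(eligible)
```

Both conditions must hold at once, which is why a large district with few high-poverty schools (district C in the sketch) is screened out.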

We narrowed the list of districts through a screening and recruitment process. Mathematica 
subcontracted with the Penn Center for Educational Leadership (CEL) at the University of 
Pennsylvania to conduct a series of screening interviews with state and district officials to determine 
each district's suitability for inclusion in the study. Beginning with the list of 98 districts, 
Mathematica and CEL eliminated 2 districts that were outside the continental United States and 43 
that had previous exposure to teacher induction programs of similar intensity and 
comprehensiveness to the ones selected for the study. Most of those districts were in California, 
Louisiana, Ohio, or Texas, but we also eliminated districts in other states that reported hiring staff to 
provide mentoring services full time, offering stipends of more than $1,000 per mentor (for 
one-on-one mentoring), or budgeting an equivalent of $1,000 or more per beginning teacher for induction 
services. 


We eliminated another 36 districts that refused to participate, had no interest in implementing 
an induction program, or did not believe that they could benefit from the intervention being offered. 
Many such districts were in the process of reducing their teaching force and therefore did not want 
to introduce interventions to promote retention. 
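
As a quick arithmetic check on the screening steps, the sketch below subtracts each reported exclusion from the initial list of 98 districts. The counts are the ones given in the text; the short reason labels are paraphrases added for readability.

```python
# Counts reported in the text.
initial = 98
exclusions = [
    ("outside the continental United States", 2),
    ("prior exposure to comparable induction programs", 43),
    ("declined, uninterested, or expected no benefit", 36),
]

remaining = initial
for reason, n in exclusions:
    remaining -= n
    print(f"after removing {n:>2} ({reason}): {remaining}")
```

The running total ends at 17, which matches the final sample of 17 districts reported in this chapter.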




Figure II.1. Sample Selection Flow Chart 



At the end of the screening and recruiting process, we had a final sample of 17 school districts 
in 13 states. By selecting districts that both met our criteria and had leaders who agreed to be in the 
study, we identified those most likely to need and implement comprehensive teacher induction in 
the future. These districts, with some combination of rising enrollments, high teacher turnover, and 
a limited supply of new teachers, are the most promising candidates for teacher induction and hence 
for this study. 

Each district was assigned to one of the two providers of treatment services, either Educational 
Testing Service (ETS) or New Teacher Center (NTC), based primarily on district preferences. The 
preference-based method of assigning districts to providers does not allow for and should not be 
used to make direct comparisons of one provider to the other. Observed differences in impacts 
between ETS and NTC districts may be due to the programs or to the set of districts each provider 
worked with; those effects cannot be separated. The process of selecting program providers does 
not affect the internal validity of the impact estimates, which are computed within district and 
district type. 

Similarly, the decision of which districts would receive a second year of intervention was 
preference based. We selected the districts to receive a second year of the treatment based on 
convenience and feasibility. We ensured a balance of ETS and NTC districts in the two-year group. 
The non-random selection of districts means that they may differ in unobserved ways beyond 
having one or two years of treatment. Therefore, we avoid direct comparisons of one-year to 
two-year districts just as we avoided comparing ETS to NTC districts. Again, this method of district 
selection does not affect the impact estimates themselves. 

Table II.1 shows the characteristics of districts included in the study. The districts served low- 
income students, with more than 40 percent of students in each district qualifying for the NSLP. 
The study included districts with a majority of their students being African American (7 of the 17 
districts), Hispanic (2 of 17), and white (3 of 17), and 5 diverse districts without a racial/ethnic 
majority. The districts were all urban; 9 of 17 districts enrolled more than 50,000 students, and 11 of 
17 included more than 50 elementary schools. The districts were in three of the nation’s four Census 
regions: Northeast, Midwest, and South. 

Table II.1 also shows the characteristics of one-year and two-year districts. Seven of the one-year 
districts and two of the two-year districts had more than 50,000 students. Two of the seven two-year 
districts and none of the one-year districts served a student population that was majority 
(greater than 50 percent) Hispanic. All four of the study districts in the Midwest region were selected 
to implement the treatment for one year. Districts in the Northeast and South were part of one-year 
and two-year groups. Throughout most of this report, we present findings for the one-year and 
two-year districts separately. 

B. Selection of Schools and Teachers 

Within each district, either all or a subset of elementary schools was selected for the study. 
Large districts exercised some discretion over the subset of schools considered for the study. 
Otherwise, we selected all schools with eligible teachers and then selected all the teachers within 
those schools who met the following eligibility criteria: 


• Elementary Grade. Teachers in K-6 were considered elementary. We excluded 
teachers of part-day prekindergarten classes. We focused on elementary rather than 
secondary schools because we needed a large number of schools per district to ensure 
feasibility of the study design. 

• New to the Profession. We encountered 58 teachers who reported more than two years 
of teaching experience in some capacity, even if the district did not recognize such 
experience. They were included if (1) the district considered such teachers as new from 
the perspective of eligibility for beginning teacher induction services and (2) the method 
for identifying teachers for the study was applied consistently to all schools within each 
district. 

Table II.1. Characteristics of Districts in Teacher Induction Sample by Length of Induction Program 

                                                    Number of Districts
District Characteristics                       One Year   Two Year        All   Percentage (All)

Demographics
  Low Income (Percentage Eligible for NSLP)
    <65                                               4          3          7         41.2
    65-70                                             2          0          2         11.8
    70-75                                             2          1          3         17.6
    75-80                                             2          3          5         29.4
    >80                                               0          0          0          0.0
  Race/Ethnicity
    Majority African American                         4          3          7         41.2
    Majority Hispanic                                 0          2          2         11.8
    Majority white                                    3          0          3         17.6
    No single majority group                          3          2          5         29.4
  Region
    Northeast                                         2          2          4         23.5
    Midwest                                           4          0          4         23.5
    West                                              0          0          0          0.0
    South                                             4          5          9         52.9

District Size
  Student Enrollment
    5,000-24,999                                      1          0          1          5.9
    25,000-49,999                                     2          5          7         41.2
    50,000-100,000                                    4          1          5         29.4
    More than 100,000                                 3          1          4         23.5
  Number of Elementary Schools
    Fewer than 50                                     3          3          6         35.3
    50-100                                            2          3          5         29.4
    More than 100                                     5          1          6         35.3

Study Sample
  Number of Mentors
    2                                                 7          4         11         64.7
    3                                                 2          2          4         23.5
    4                                                 1          0          1          5.9
    5                                                 0          1          1          5.9
  Number of Sample Teachers
    25-49                                             6          2          8         47.1
    50-74                                             2          4          6         35.3
    75-100                                            2          0          2         11.8
    More than 100                                     0          1          1          5.9

Sample Size (Districts)                              10          7         17

Source: Mathematica analysis using the Common Core of Data 2004-2005 from the National Center for 
Education Statistics; Mathematica teacher induction survey management system. 

NSLP = National School Lunch Program. 

• Not Already Receiving Support. Some alternative teacher preparation or certification 
programs continue to support teachers during their first year of teaching. Although 
teachers receiving such support were rare in study schools, we excluded them from the 
study in order to prevent duplication of induction services. We did, however, include 
teachers in alternative certification programs who were not receiving induction services 
from their programs. 


We ultimately included 418 elementary schools in the study across the 17 districts. Tables II.2 
and II.3 show the percentages of schools in one- and two-year districts serving low-income and 
minority students, as well as the grade configurations of the schools. Most of the schools in both 
types of districts employed one, two, or three eligible beginning teachers. 


Table II.2. School Characteristics in One-Year Districts by Treatment Status (Percentages) 

School Characteristic              All Schools   Treatment     Control  Difference     P-value

Percent Eligible for NSLP                                                                0.592
  <50%                                     8.5         9.3         7.8         1.5
  50-75%                                  23.7        21.0        26.4        -5.4
  75-100%                                 67.8        69.7        65.8         3.9
Race/Ethnicity                                                                           0.863
  Majority African American               43.8        43.3        44.3        -1.0
  Majority Hispanic                       13.9        15.7        12.1         3.6
  Majority white                          23.4        22.1        24.6        -2.5
  Other/mixed                             18.9        18.9        19.0        -0.1
Grade Configuration                                                                      0.907
  Pre-K to 5 or K to 5                    64.4        65.5        63.4         2.1
  Pre-K to 8 or K to 8                    26.4        26.1        26.7        -0.7
  Other                                    9.2         8.4         9.9        -1.5
Number of Sample Teachers                                                                0.270
  1                                       41.6        39.3        43.8        -4.5
  2                                       23.3        23.8        22.8         1.0
  3                                       20.4        23.0        17.8         5.2
  4                                        6.1         8.2         4.1         4.1
  More than 4                              8.6         5.7        11.6        -5.8

Sample Size (Schools)                      252         124         128

Source: Mathematica analysis using the Common Core of Data 2004-2005 from the National Center for 
Education Statistics. 

Note: Data are weighted to account for the study design. Significance tests for categorical variables are 
design-adjusted F-tests of the difference in distributions. None of the differences is statistically 
significant at the 0.05 level. 

NSLP = National School Lunch Program. 

Table II.3. School Characteristics in Two-Year Districts by Treatment Status (Percentages) 

School Characteristic              All Schools   Treatment     Control  Difference     P-value

Percent Eligible for NSLP                                                                0.365
  <50%                                     8.7        11.1         6.2         4.9
  50-75%                                  19.3        15.4        23.4        -8.0
  75-100%                                 72.0        73.5        70.4         3.1
Race/Ethnicity                                                                           0.383
  Majority African American               44.7        44.6        44.8        -0.2
  Majority Hispanic                       33.8        37.8        29.6         8.2
  Majority white                           6.7         7.2         6.2         1.1
  Other/mixed                             14.8        10.3        19.4        -9.1
Grade Configuration                                                                      0.662
  Pre-K to 5 or K to 5                    81.3        84.0        78.6         5.4
  Pre-K to 8 or K to 8                    11.5         9.5        13.6        -4.1
  Other                                    7.2         6.5         7.8        -1.3
Number of Sample Teachers                                                                0.695
  1                                       32.1        29.9        34.3        -4.4
  2                                       24.9        27.8        22.0         5.9
  3                                       14.7        17.3        12.1         5.2
  4                                       12.5        11.4        13.7        -2.3
  More than 4                             15.7        13.5        17.9        -4.4

Sample Size (Schools)                      166          86          80

Source: Mathematica analysis using the Common Core of Data 2004-2005 from the National Center for 
Education Statistics. 

Note: Data are weighted to account for the study design. Significance tests for categorical variables are 
design-adjusted F-tests of the difference in distributions. None of the differences is statistically 
significant at the 0.05 level. 

NSLP = National School Lunch Program. 


C. Random Assignment of Schools to Treatment 

The defining feature of the study is the random assignment of schools to a treatment group that 
received the comprehensive induction services or to a control group that received the prevailing 
induction services provided by the district. Given the large sample, we can attribute the differences 
in average outcomes between the two groups to the addition of comprehensive induction services, 
ruling out all other confounding factors. 

1. Method of Random Assignment 

Eligible teachers in a school were either all exposed or all not exposed to treatment, a method 
known as cluster random assignment. Cluster random assignment was necessary because varying the 
types of induction services available in the same school building could result in contamination 
between services. For example, a mentor might feel uncomfortable being told not to provide any 
assistance to the colleague of one of his or her beginning teachers if the colleague was struggling 
with a problem. Furthermore, the presence of a mentor in the building could affect how existing 
supports and other resources are distributed among faculty at that school. Therefore, we assigned all 
eligible teachers within a school to treatment or control status based on the school in which they 
were expected to teach at the point of random assignment (baseline). 




To increase statistical precision, we used block random assignment, with school districts as 
blocks. In other words, we conducted random assignment of schools within districts to ensure that 
each district was represented equally in both groups and that treatment status was not confounded 
with the school district. Block random assignment took into account the considerable variation 
among districts in the policies, student populations, and environments that could affect the study’s 
outcomes. 

Within districts, we used an efficient randomization technique called constrained minimisation. For 
each district, we listed all admissible allocations of schools to treatment and control groups, and we 
randomly selected one allocation, with each allocation having an equal probability of selection. The 
admissible allocations were those that achieved an appropriate degree of balance between the 
treatment and control groups in terms of the overall number of eligible teachers and teaching 
assignment (grade level). Because the admissible allocations were defined independently of treatment 
status, every school and every teacher had a 50 percent probability of assignment to the treatment 
group. Glazerman et al. (2005) provide details on this random assignment method. 
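
The procedure just described can be illustrated with a minimal sketch for a single district. The school names, teacher counts, and the balance tolerance `MAX_TEACHER_GAP` are hypothetical, and the sketch balances only on the number of eligible teachers, not on grade level; it is not the study's actual implementation, for which see Glazerman et al. (2005).

```python
import itertools
import random
from typing import Dict, List, Tuple

# Hypothetical schools in one district: name -> number of eligible teachers.
schools: Dict[str, int] = {"S1": 1, "S2": 2, "S3": 1, "S4": 3, "S5": 2, "S6": 1}
MAX_TEACHER_GAP = 1  # assumed balance tolerance; the study's actual rule is not given here


def admissible_allocations(schools: Dict[str, int], max_gap: int) -> List[Tuple[str, ...]]:
    """List every treatment set whose teacher count is within max_gap of the control group's."""
    names = sorted(schools)
    total = sum(schools.values())
    keep = []
    for k in range(len(names) + 1):
        for treat in itertools.combinations(names, k):
            t = sum(schools[s] for s in treat)
            if abs(t - (total - t)) <= max_gap:  # symmetric in treatment vs. control
                keep.append(treat)
    return keep


def assign(schools: Dict[str, int], max_gap: int, rng: random.Random) -> set:
    """Draw one admissible allocation, each with equal probability."""
    return set(rng.choice(admissible_allocations(schools, max_gap)))


allocs = admissible_allocations(schools, MAX_TEACHER_GAP)
# Share of admissible allocations that place each school in the treatment group.
prob = {s: sum(s in a for a in allocs) / len(allocs) for s in schools}
print(prob)
print(assign(schools, MAX_TEACHER_GAP, random.Random(0)))
```

Because the balance rule treats the two groups symmetrically, the complement of every admissible allocation is also admissible, so each school appears in the treatment group in exactly half of them. Running the same routine separately for each district corresponds to using districts as blocks.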

2. Treatment-Control Balance at Baseline 

Random assignment produced groups that were equivalent on a wide variety of measures. 
Tables II.2 to II.11 describe the sample of schools and teachers along the dimensions measured, 
presenting the average characteristics separately by treatment status. The treatment and control 
schools exhibited similar percentages of low-income students and minority students, as shown in 
Tables II.2 and II.3. 
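
To make the layout of the balance tables concrete, the sketch below recomputes the Difference column for the NSLP rows of the one-year school table, on the assumption that the column reports the treatment percentage minus the control percentage. The percentages are the published ones; because the published differences were presumably computed from unrounded weighted data, recomputed values can differ from the table by a tenth of a point in other rows.

```python
# Treatment and control percentages for the NSLP rows of the one-year school table.
rows = {
    "<50%":    (9.3, 7.8),
    "50-75%":  (21.0, 26.4),
    "75-100%": (69.7, 65.8),
}

# Difference is assumed to be treatment minus control.
differences = {label: round(t - c, 1) for label, (t, c) in rows.items()}
for label, diff in differences.items():
    print(f"{label:>8}: {diff:+.1f}")
```

A negative value means the control group had the larger share in that category.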

While teachers were randomized indirectly, via their schools, the treatment and control teachers 
were similar in terms of demographic characteristics. Tables II.4 and II.5 present demographic 
characteristics of the study teachers by treatment group from one-year and two-year districts, 
respectively. Of 532 teachers in one-year districts responding to the baseline survey, similar 
percentages of treatment and control group members were white (74 and 77 percent, respectively), 
female (86 and 88 percent), under age 25 (51 and 49 percent), married (47 and 45 percent), and had 
no children at home (74 and 75 percent). Of 421 teachers in two-year districts responding to the 
baseline survey, similar percentages of treatment and control group members were white (43 and 
44 percent, respectively), female (89 and 91 percent), under age 25 (48 and 47 percent), married 
(43 percent for both groups), and had no children at home (66 and 63 percent). 

Treatment and control teachers were similar in terms of most professional characteristics, with 
some exceptions. Tables II.6 through II.9 describe the professional backgrounds of teachers for the 
one-year and two-year districts, respectively. In one-year districts, similar percentages of treatment 
and control teachers had advanced degrees, earned bachelor’s degrees from highly selective 
colleges⁶, had an education degree, and entered the profession with no student teaching (Table II.6). 
There was a statistically significant difference in how the teachers in one-year districts entered the 
profession, with a higher percentage of treatment teachers coming from a traditional four-year 
program (62 versus 56 percent) and a lower percentage of treatment teachers entering through an 
alternative preparation program (13 versus 22 percent). There was also a statistically significant 
difference in the type of teaching certificate held, with a higher percentage of treatment teachers 
holding a regular certificate (70 versus 60 percent) and a lower percentage of treatment teachers 
holding a probationary certificate (23 versus 36 percent). For those teachers who gave us permission 
to obtain their SAT or ACT score and for whom scores were available, we found no statistically 
significant differences in scores between the treatment and control teachers (Table II.7). With two- 
year districts, none of the differences between treatment and control teachers in professional 
background characteristics were statistically significant (Tables II.8 and II.9). 

6 A “highly selective” college or university is one that is rated as “most competitive,” “highly competitive,” or “very 
competitive” by the 2003 edition of Barron's Profiles of American Colleges. 


Table II.4. Teacher Demographic Characteristics by Treatment Status (Percentages): One-Year Districts 

Teacher Characteristics                      All Teachers   Treatment     Control  Difference     P-value

Gender                                                                                              0.519
  Male                                               12.6        13.6        11.6         2.0
  Female                                             87.4        86.4        88.4        -2.0
Race/Ethnicity                                                                                      0.585
  White, non-Hispanic                                75.5        74.1        77.0        -2.9
  African American, non-Hispanic                     14.0        15.1        13.0         2.1
  Hispanic                                            5.5         4.8         6.2        -1.4
  Other/mixed/unknown                                 5.0         6.0         3.9         2.2
Age (Years) (a)                                                                                     0.902
  20-25                                              49.8        50.5        49.1         1.4
  26-29                                              19.5        18.2        20.8        -2.6
  30-39                                              18.9        19.6        18.2         1.4
  40 or older                                        11.8        11.7        11.9        -0.1
Marital Status                                                                                      0.685
  Married or living with a partner                   45.7        46.6        44.6         2.0
  Single, separated, divorced, or widowed            54.3        53.4        55.4        -2.0
Children Living in the Home                                                                         0.713
  None                                               74.5        73.9        75.1        -1.2
  One or more children younger than age 5            10.4        11.5         9.3         2.2
  One or more children, none younger than age 5      15.1        14.6        15.6        -1.0

Sample Size (Teachers)                                532         267         265

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers. 

Note: Data are weighted to account for the study design. Significance tests for categorical variables are 
design-adjusted F-tests of the difference in distributions. 

(a) Age of teacher is measured as of December 31, 2005, during the school year in which the study 
began. 

None of the differences is statistically significant at the 0.05 level. 

Table II.5. Teacher Demographic Characteristics by Treatment Status (Percentages): Two-Year Districts 

Teacher Characteristics                      All Teachers   Treatment     Control  Difference     P-value

Gender                                                                                              0.604
  Male                                               10.1        10.9         9.3         1.6
  Female                                             89.9        89.1        90.7        -1.6
Race/Ethnicity                                                                                      0.382
  White, non-Hispanic                                43.5        42.8        44.3        -1.5
  African American, non-Hispanic                     25.5        29.5        21.4         8.1
  Hispanic                                           27.1        23.5        31.0        -7.5
  Other/mixed/unknown                                 3.8         4.3         3.3         0.9
Age (Years) (a)                                                                                     0.388
  20-25                                              47.4        47.5        47.3         0.2
  26-29                                              20.0        20.9        19.0         1.8
  30-39                                              21.3        18.2        24.5        -6.3
  40 or older                                        11.4        13.5         9.2         4.3
Marital Status                                                                                      0.910
  Married or living with a partner                   43.1        43.4        42.8         0.6
  Single, separated, divorced, or widowed            56.9        56.6        57.2        -0.6
Children Living in the Home                                                                         0.807
  None                                               64.5        65.7        63.4         2.3
  One or more children younger than age 5            19.7        19.8        19.7         0.1
  One or more children, none younger than age 5      15.7        14.6        16.9        -2.3

Sample Size (Teachers)                                421         222         199

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers. 

Note: Data are weighted to account for the study design. Significance tests for categorical variables are 
design-adjusted F-tests of the difference in distributions. 

(a) Age of teacher is measured as of December 31, 2005, during the school year in which the study 
began. 

None of the differences is statistically significant at the 0.05 level. 

Table II.6. Teacher Professional Background by Treatment Status (Percentages): One-Year Districts 

Teacher Characteristics                      All Teachers   Treatment     Control  Difference     P-value

Has Master’s or Doctoral Degree                      26.3        24.0        28.7        -4.7       0.289
Earned a Bachelor's Degree from a
  Highly Selective College                           30.6        31.2        30.0         1.2       0.790
Earned a Degree with Education-Related
  Major or Minor                                     77.7        76.8        78.5        -1.7       0.680
How Entered the Profession                                                                          0.048*
  Traditional program (four-year)                    59.1        62.4        55.7         6.7
  Traditional program (post-baccalaureate)           22.6        22.9        22.4         0.5
  Teach for America                                   0.7         1.5         0.0         1.5
  Other alternative preparation program
    or unknown                                       17.5        13.3        21.9        -8.6
Career Changer                                       13.3        12.9        13.9        -1.0       0.731
Teaching Certificate                                                                                0.009*
  Regular                                            64.8        69.8        59.5        10.3
  Probationary                                       29.4        23.3        36.0       -12.6
  Emergency/waiver/other                              5.7         6.8         4.5         2.3
Weeks of Student Teaching                                                                           0.277
  Zero                                               13.7        12.0        15.5        -3.5
  1-12                                               20.0        19.3        20.7        -1.5
  13-16                                              38.2        36.8        39.6        -2.7
  17 or more                                         28.2        31.9        24.2         7.7

Sample Size (Teachers)                                532         267         265

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers. 

Note: Data are weighted to account for the study design. Significance tests for categorical variables are 
design-adjusted F-tests of the difference in distributions. 

*Significantly different from zero at the 0.05 level. 

Table II.7. Teacher College Entrance Exams by Treatment Status: One-Year Districts 

Teacher Characteristics                      All Teachers   Treatment     Control  Difference     P-value

College Entrance Exam Scores (Percentages)                                                          0.109
  Did not take exam                                   8.9         8.3         9.5        -1.2
  Did not consent to obtain scores                   19.3        16.6        22.2        -5.6
  Scores not found                                   10.6        13.6         7.5         6.2
  Scores reported                                    61.2        61.4        60.8         0.6
SAT Combined Score (or ACT Equivalent)               1030        1033        1028           5       0.789

Sample Size (All Teachers)                            561         275         286
Sample Size (Teachers with Usable ACT
  or SAT Scores)                                      327         164         163

Source: Mathematica analysis using data from the College Board and ACT, Inc. 

Note: ACT scores were converted to SAT score equivalents using concordance tables in Dorans et al. (1997). 
Significance tests for categorical variables are design-adjusted F-tests of the difference in distributions. 

None of the differences is statistically significant at the 0.05 level. 

Statistically significant differences were found between treatment and control groups in 
teachers’ assignments. For both the one-year and two-year districts, a smaller percentage of control 
than treatment teachers said that they were responsible for reading outcomes (86 percent of control 
teachers versus 92 percent of treatment teachers in the one-year districts, and 78 percent of control 
teachers versus 90 percent of treatment teachers in the two-year districts, as shown in Tables II.10 
and II.11). The control group in the two-year districts contained a higher percentage of subject 
teachers than did the treatment group (12 versus 3 percent). Subject teachers included those who 
taught a single core subject such as math or science, as well as those who taught subjects such as art 
and music. This could mean that the process for identifying eligible teachers worked differently in 
the treatment and control schools, although non-classroom (including special subject) teachers were 
automatically excluded from the student test score analyses. The special subject teachers were 
included in the analysis of induction services received, teacher attitudes, and retention in order to 
measure outcomes for all teachers to whom districts would normally provide comprehensive 
induction services. The findings were robust to the inclusion or exclusion of special subject teachers. 

3. Integrity of the Random Assignment Design 

A randomized trial is the strongest evaluation design for identifying causal relationships, but 
even randomized experiments are subject to threats that can undercut a researcher’s ability to draw 
inferences about the effectiveness of the intervention. We examined two typical threats to random 
assignment studies, noncompliance and attrition (study dropouts), and found that these issues 
were not sufficiently serious to undermine the integrity of the study’s findings. 

Table II.8. Teacher Professional Background by Treatment Status (Percentages): Two-Year Districts 

Teacher Characteristics                      All Teachers   Treatment     Control  Difference     P-value

Has Master's or Doctoral Degree                      15.9        16.2        15.7         0.5       0.915
Earned a Bachelor’s Degree from a
  Highly Selective College                           28.8        30.0        27.5         2.6       0.565
Earned a Degree with Education-Related
  Major or Minor                                     64.6        63.6        65.7        -2.1       0.689
How Entered the Profession                                                                          0.395
  Traditional program (four-year)                    61.5        59.3        63.7        -4.4
  Traditional program (post-baccalaureate)            9.2         7.8        10.6        -2.7
  Teach for America                                   6.2         5.7         6.6        -0.8
  Other alternative preparation
    program/unknown                                  23.2        27.1        19.1         8.0
Career Changer                                       14.9        15.9        13.9         2.0       0.597
Teaching Certificate                                                                                0.892
  Regular                                            50.4        49.5        51.3        -1.7
  Probationary                                       41.9        42.1        41.7         0.4
  Emergency/waiver/other                              7.7         8.4         7.1         1.3
Weeks of Student Teaching                                                                           0.445
  Zero                                               28.5        30.6        26.2         4.4
  1-12                                               18.3        16.2        20.5        -4.2
  13-16                                              34.6        36.8        32.3         4.5
  17 or more                                         18.6        16.3        21.0        -4.7

Sample Size (Teachers)                                421         222         199

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers. 

Note: Data are weighted to account for the study design. Significance tests for categorical variables are 
design-adjusted F-tests of the difference in distributions. 

None of the differences is statistically significant at the 0.05 level. 

Table II.9. Teacher College Entrance Exams by Treatment Status: Two-Year Districts 

Teacher Characteristics                      All Teachers   Treatment     Control  Difference     P-value

College Entrance Exam Scores (Percentages)                                                          0.891
  Did not take exam                                  14.3        13.0        15.6        -2.6
  Did not consent to obtain scores                   22.7        23.4        22.0         1.5
  Scores not found                                   11.6        12.3        10.9         1.5
  Scores reported                                    51.4        51.2        51.6        -0.3
SAT Combined Score (or ACT Equivalent)                975         961         990         -30       0.287

Sample Size (All Teachers)                            448         231         217
Sample Size (Teachers with Usable ACT
  or SAT Scores)                                      221         117         104

Source: Mathematica analysis using data from the College Board and ACT, Inc. 

Note: ACT scores were converted to SAT score equivalents using concordance tables in Dorans et al. (1997). 
Significance tests for categorical variables are design-adjusted F-tests of the difference in distributions. 

None of the differences is statistically significant at the 0.05 level. 

Noncompliance. Noncompliance with treatment assignment, a concern in randomized 
experiments in which subjects in the control group receive treatment services or subjects in the 
treatment group fail to take up treatment (Angrist et al. 1996), was not a serious problem in the 
teacher induction study. We put several safeguards in place to document teachers’ compliance with 
treatment assignment and districts’ cooperation with program implementation. First, an induction 
activities survey, administered twice during the implementation year, allowed us to measure the 
induction services each sample member received. Second, researchers from WestEd, a subcontractor 
to Mathematica, monitored implementation of the comprehensive induction services and fidelity to 
the induction model by collecting information on attendance at program activities and watching for 
services that might have been extended to teachers in schools not randomly assigned to the 
treatment group. Third, we monitored program mentor interactions via program logs and teacher 
mobility using field reports that were filed in a tracking system to complement the survey data on 
teacher mobility. Collectively, these data sources yielded a complete picture of service receipt. 

The main form of noncompliance, “crossover” resulting from control group members’ receipt
of treatment, was not a problem. We designed the study to avoid contamination within the school
and found limited mobility between school types (control to treatment or vice versa) during the
school year. We identified fewer than three teachers out of more than 1,000 who transferred from a
control to a treatment school and received services.

Table II.10. Teaching Assignments by Treatment Status (Percentages): One-Year Districts

                                                 All
Teacher Characteristics                     Teachers  Treatment   Control  Difference   P value

Grade Level                                                                               0.151
  Kindergarten                                  13.6       12.7      14.6        -1.8
  Grade 1                                       15.2       14.2      16.2        -1.9
  Grade 2                                       14.4       16.9      11.8         5.0
  Grade 3                                       13.2       15.3      10.9         4.4
  Grade 4                                       12.9       14.5      11.1         3.4
  Grade 5                                       10.0        8.4      11.6        -3.2
  Multiple, other                               20.8       17.9      23.8        -5.9

Responsible for Reading Outcomes                89.3       92.2      86.2         6.0*    0.034
Responsible for Mathematics Outcomes            91.0       93.0      88.9         4.1     0.110

Subject Specialty^a
  Teaches only one grade level                  82.0       85.3      78.5         6.7     0.104
  Specialist: bilingual, ESL, or ELL              ^b         ^b        ^b
  Specialist: special education                  7.5        5.7       9.4        -3.7     0.142
  Specialist: core academic or other             4.9        3.9       6.0        -2.1     0.288
    subject (e.g., reading, social studies,
    mathematics, science, computers,
    foreign language, art, music, gym)

Teaching in Preferred Grade and Subject         79.6       81.6      77.6         4.0     0.138

Sample Size (Teachers)                           532        267       265

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers.

Note: Data are weighted to account for the study design. Significance tests for categorical variables are
      design-adjusted F-tests of the difference in distributions.

^a Subject specialty variables are not exhaustive or mutually exclusive. In this table, a “specialist” is
someone who does not teach just one grade level.

^b Exact value suppressed to protect respondent confidentiality.

* Significantly different from zero at the 0.05 level.

ESL = English as a Second Language; ELL = English Language Learner.

Table II.11. Teaching Assignments by Treatment Status (Percentages): Two-Year Districts

                                                 All
Teacher Characteristics                     Teachers  Treatment   Control  Difference   P value

Grade Level                                                                               0.151
  Kindergarten                                  18.3       19.5      17.1         2.4
  Grade 1                                       14.4       14.4      14.4         0.0
  Grade 2                                       16.3       17.4      15.1         2.2
  Grade 3                                       13.6       13.7      13.5         0.2
  Grade 4                                        9.9        9.8      10.1        -0.3
  Grade 5                                        7.9        8.9       6.9         2.0
  Multiple, other                               20.8       17.9      23.8        -5.9

Responsible for Reading Outcomes                84.4       90.3      78.2        12.1*    0.003
Responsible for Mathematics Outcomes            83.3       86.4      80.1         6.3     0.092

Subject Specialty^a
  Teaches only one grade level                  82.9       85.4      80.3         5.1     0.209
  Specialist: bilingual, ESL, or ELL             1.7        1.7       1.7         0.0     0.995
  Specialist: special education                  5.3        6.6       4.0         2.6     0.301
  Specialist: core academic or other             7.5        3.4      11.8        -8.4*    0.003
    subject (e.g., reading, social studies,
    mathematics, science, computers,
    foreign language, art, music, gym)

Teaching in Preferred Grade and Subject         78.4       78.7      78.1         0.7     0.876

Sample Size (Teachers)                           421        222       199

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers.

Note: Data are weighted to account for the study design. Significance tests for categorical
      variables are design-adjusted F-tests of the difference in distributions.

^a Subject specialty variables are not exhaustive or mutually exclusive. In this table, a “specialist” is
someone who does not teach just one grade level.

* Significantly different from zero at the 0.05 level.

ESL = English as a Second Language; ELL = English Language Learner.


The second form of noncompliance, “no-shows” resulting from treatment group members
failing to adopt the treatment, did not occur frequently. We did see some treatment group teachers
refusing induction services or transferring to schools in which the induction services would not be
available (for example, if they left the district). Nine treatment schools representing 12 teachers in
one district and 3 teachers in another district refused to implement the comprehensive induction
services that were offered. These 15 teachers made up 3 percent of the treatment group. The degree
of program dropout is discussed in Chapter IV. All sample members are included in the impact
analysis regardless of compliance status and classified according to their school's original treatment
assignment.

Nonresponse and Study Attrition. Nonresponse and study attrition, especially differential
attrition by treatment status, are another issue that affects the quality of any randomized experiment
(or any longitudinal study regardless of design). For this study, response rates were at least
88 percent for the full sample on all major surveys in year 1 of the study, at least 83 percent in year
2, and at least 85 percent in year 3 (see Chapter III, Table III.1), yet we observed differences in
response rates by treatment status that were statistically significant. For example, the control group
response rate for the spring 2006 induction activities questionnaire was 83 percent and the
corresponding treatment group rate was 93 percent. A concern with differential response rates is
that if nonresponse is not random with respect to outcomes, then the degree to which nonresponse
affects the average outcomes will differ by treatment status, and the impact estimates, which are
differences in mean outcomes for respondents only, will be biased. If, for example,
nonrespondents have worse outcomes than do respondents, we would expect the lower response
rates for the control group to translate into an upwardly biased estimate of the counterfactual
outcome and therefore a downwardly biased estimate of the impact.

To mitigate such an outcome, we constructed nonresponse adjustment weights. Such weights
let the respondents within each treatment group who look most like nonrespondents carry a greater
weight so that they can stand in for their missing counterparts. We adjusted the weights to account
for the variations in design implementation across districts. A full discussion of weights is included
in Appendix A. We used these weights in the impact estimation, although the weights did not
substantially change the findings.
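
The idea behind such weights can be sketched with a simple weighting-class adjustment. This is an illustration only, not the procedure documented in Appendix A: the cell definitions, field names, and base weights below are hypothetical. Within each treatment group and cell, respondents' weights are inflated so that they also represent the nonrespondents in that cell.

```python
from collections import defaultdict


def nonresponse_weights(records):
    """Weighting-class adjustment (illustrative sketch).

    Each record is a dict with hypothetical keys: "group" (treatment or
    control), "cell" (a grouping of similar sample members), "base_weight",
    and "responded". Respondents are scaled up so that every
    (group, cell) keeps its full-sample weight total; nonrespondents get 0.
    """
    total = defaultdict(float)       # weight of all sample members per cell
    responding = defaultdict(float)  # weight of respondents per cell
    for r in records:
        key = (r["group"], r["cell"])
        total[key] += r["base_weight"]
        if r["responded"]:
            responding[key] += r["base_weight"]

    weights = []
    for r in records:
        key = (r["group"], r["cell"])
        if r["responded"]:
            weights.append(r["base_weight"] * total[key] / responding[key])
        else:
            weights.append(0.0)
    return weights


# Hypothetical sample: one control nonrespondent in the "high_poverty" cell.
records = [
    {"group": "control", "cell": "high_poverty", "base_weight": 1.0, "responded": True},
    {"group": "control", "cell": "high_poverty", "base_weight": 1.0, "responded": False},
    {"group": "control", "cell": "low_poverty", "base_weight": 1.0, "responded": True},
    {"group": "treatment", "cell": "high_poverty", "base_weight": 1.0, "responded": True},
]
weights = nonresponse_weights(records)
```

In this toy sample the responding control teacher in the high-poverty cell stands in for the missing colleague, so the adjusted weights still sum to the full sample size.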

D. Impact Estimation 

The goal of the impact analysis is to estimate the effect of comprehensive teacher induction on 
a range of teacher outcomes relative to those that would have been observed in the absence of the 
comprehensive program. To that end, we examined whether student achievement gains, teacher 
mobility patterns, and other outcomes for teachers randomly assigned to the receipt of 
comprehensive induction services differed from the outcomes for those we assigned to the receipt 
of the prevailing induction services offered by the district. 

Appendix A details the methods used for estimating the impacts of the comprehensive 
induction programs, as well as the alternate estimation approaches we used for testing the 
robustness of the study’s findings. We illustrate the effect of alternate approaches by using a 
benchmark model that imposes the most reasonable set of assumptions and measurement rules and 
then comparing it to a set of alternatives that implement deviations, one at a time, from that
benchmark. For example, the benchmark model specifies a set of variables used as covariates for
regression adjustment of the impact estimates. The set of benchmark covariates differs for each
outcome. 

One virtue of random assignment is its analytic simplicity. The difference between the average 
outcome for the treatment and control groups is an unbiased estimate of the impact of the treatment 
on any outcome of interest. A t-test of the difference in average outcomes enables the evaluator to
assess whether the observed difference could have been attributable to chance or to the program.

In the case of the teacher induction experiment, the hypothesis tests must be constructed in a
way that is consistent with the study design. Specifically, we must account for the fact that we
randomly assigned schools, rather than individual teachers, to treatment groups. Recognizing that
teachers from the same school share the same principal, school culture, building conditions, 
neighborhood, and other characteristics that might affect teacher outcomes, we cannot treat teachers 
in the same school as independent observations. 

Therefore, we use a model-based approach to estimate program impacts. The statistical model
not only allows us to represent the non-independence of observations explicitly, it also allows us to
exploit the data on student, teacher, and school background characteristics to increase the precision
of the estimates of treatment effects. The regression model allows us to control for the effects of a
range of background characteristics, not just treatment status, on the outcomes of interest. By
accounting for the many variables that affect teacher retention, for example, we can reduce the
amount of unexplained variation in mobility decisions and thereby increase our confidence in the
estimates of treatment effects.

The other advantage of the regression model is its ability to acknowledge the hierarchical 
structure of the data — for example, the nesting of teachers within schools. Accordingly, the units of 
analysis can be properly specified, and unbiased estimates of the standard errors used to conduct 
hypothesis tests can be devised. Although the study defines outcomes at the teacher level, we 
performed random assignment at the school level; hence, the regression model must account for the 
clustering of teachers within schools. Appendix A describes the statistical methods in more detail. 
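
To illustrate why the unit of assignment matters, the sketch below compares treatment and control groups after collapsing teacher outcomes to one mean per school. This is a simpler stand-in for the hierarchical regression model described in Appendix A, not the study's estimator: it weights schools equally, uses no covariates, and the data and function name are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def school_level_contrast(teachers):
    """Compare arms using one averaged outcome per school.

    teachers: iterable of (school_id, treated, outcome) tuples, where
    `treated` is the school's random assignment (True or False).
    Returns (difference in mean school outcomes, standard error, t-statistic).
    """
    outcomes_by_school = defaultdict(list)
    arm_of_school = {}
    for school_id, treated, outcome in teachers:
        outcomes_by_school[school_id].append(outcome)
        arm_of_school[school_id] = treated

    # Collapse to the unit of random assignment: one mean per school.
    school_means = {True: [], False: []}
    for school_id, outcomes in outcomes_by_school.items():
        school_means[arm_of_school[school_id]].append(mean(outcomes))

    t_means, c_means = school_means[True], school_means[False]
    diff = mean(t_means) - mean(c_means)
    # Variation between schools, not between teachers, drives the standard error.
    se = sqrt(variance(t_means) / len(t_means) + variance(c_means) / len(c_means))
    return diff, se, diff / se


# Hypothetical outcomes for seven teachers in four schools.
teachers = [
    ("A", True, 4.0), ("A", True, 6.0), ("B", True, 7.0),
    ("C", False, 3.0), ("C", False, 5.0), ("D", False, 2.0), ("D", False, 4.0),
]
diff, se, t_stat = school_level_contrast(teachers)
```

Treating the seven teachers as independent observations would understate the standard error, because teachers in the same school share unmeasured influences on their outcomes.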

Impact findings are presented in two ways in this report. First, we present them as differences
between the (regression-adjusted) means or percentages for the treatment and control groups.
Second, for continuous outcome variables, we present the impact as an effect size, defined as the
fraction of a standard deviation of the outcome variable. Effect sizes are a common metric used to
compare findings across studies that rely on different measurement instruments. Effect sizes are
computed as the impact divided by the standard deviation of the outcome variable. The standard
deviation is computed using the full sample (treatment and control groups).
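
The effect size calculation described above can be sketched as follows. The outcome values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which formula was applied.

```python
from statistics import mean, stdev


def effect_size(impact, treatment_outcomes, control_outcomes):
    """Impact expressed as a fraction of the full-sample standard deviation."""
    full_sample = list(treatment_outcomes) + list(control_outcomes)
    return impact / stdev(full_sample)


# Hypothetical outcome scores; the impact here is a raw difference in means,
# whereas the report uses regression-adjusted differences.
treatment_outcomes = [3.0, 4.0, 5.0]
control_outcomes = [1.0, 2.0, 3.0]
impact = mean(treatment_outcomes) - mean(control_outcomes)
es = effect_size(impact, treatment_outcomes, control_outcomes)
```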

E. Interpreting Impact Estimates When There Are Multiple Comparisons 

To interpret the impact estimates, this report relies on conventional notions of statistical
significance. To determine if an impact estimate represents a true effect of the treatment or just a
chance difference between the treatment and control groups, we conduct a statistical hypothesis test.
The effect is deemed statistically significant if the probability of observing a difference at least as
large as the one estimated (the “p-value”) in the absence of a true impact is less than 5 percent. In
other words, there is a 5 percent chance of “Type I error,” declaring a finding to be statistically
significant when the treatment was not responsible for the effect.

Using these rules, the probability of committing a Type I error is always 5 percent for any one
test, but as the number of tests increases, the chance of committing at least one such error rises,
leading to what is known as the multiple comparison problem: the risk of ignoring a large number
of nonsignificant results and regarding one or two statistically significant results as true impacts.
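
A short calculation shows how quickly this risk grows. The sketch assumes the tests are statistically independent, which is a simplification; tests on related outcomes from the same sample generally are not.

```python
def chance_of_any_false_positive(num_tests, alpha=0.05):
    """Probability of at least one Type I error across independent tests
    when no true impacts exist and each test uses significance level alpha."""
    return 1 - (1 - alpha) ** num_tests


one_test = chance_of_any_false_positive(1)
ten_tests = chance_of_any_false_positive(10)
twenty_tests = chance_of_any_false_positive(20)
```

Under that assumption, the chance of at least one false positive is 5 percent for a single test, about 40 percent for 10 tests, and about 64 percent for 20 tests.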

There are many solutions to this problem, but we discuss two here. One solution, which we 
followed in this report, is to note the number of non-significant findings when reporting on 
significant findings, so the reader has the appropriate context. For example, it would be 
inappropriate to suppress non-significant findings from a table without at least noting that the 
additional tests were conducted. 

Another set of solutions includes formalized approaches that control the family-wise Type I
error rate, which is the probability of making at least one Type I error in a group of hypothesis tests, or
that try to control the False Discovery Rate (FDR), which is the expected percentage of statistically
significant findings that are Type I errors. One such formalized approach that we considered for this
report is an FDR control procedure developed by Benjamini and Hochberg (1995). The method calls for
rank-ordering the tests by their p-value from lowest to highest and determining a cutoff p-value above
which all of the findings are deemed statistically insignificant, even if their individual p-values may fall
below 0.05.
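
The rank-ordering step can be sketched as below. This is a generic illustration of the Benjamini-Hochberg procedure with hypothetical p-values, not code or data from this study. It uses the conventional "less than or equal to" comparison against 0.05 × (rank / number of tests).

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which tests remain significant
    after a Benjamini-Hochberg adjustment at level alpha."""
    m = len(p_values)
    # Indices of the tests, ordered from lowest to highest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # The cutoff is the highest rank whose p-value is within alpha * rank / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank

    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical family of five tests.
p_values = [0.003, 0.034, 0.092, 0.151, 0.876]
flags = benjamini_hochberg(p_values)
```

In this example the second test has an individual p-value below 0.05, but it exceeds its rank-specific threshold of 0.05 × (2/5) = 0.02, so only the first test remains significant after adjustment.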


This report used the first approach of contextualizing the findings and did not present any
adjustments based on the Benjamini-Hochberg (BH) method because such adjustments were
unnecessary or inappropriate (see Isenberg et al. 2009 for a fuller discussion). There was only one
impact estimate in this report where the method could have been appropriately applied and a
different test result would have been reached. In Table VI.1, a significant negative impact on the
prior literacy lesson content score of teachers who remained in the treatment group versus those in
the control group would not be regarded as significant after applying a BH adjustment. In the text,
we discuss the non-significant findings and do not draw conclusions based on this finding out of
context.


' This cutoff is determined to be the last test in the list, rank ordered from lowest to highest p-value, for which the
test’s p-value is less than 0.05*(i/m), where i is the rank and m is the number of tests being conducted.

III. DATA

In accordance with the conceptual framework presented in Chapter I, we collected detailed data
on teacher induction services, outcomes, and contextual factors. The data collection effort was most
intense during the 2005-2006 school year, while the comprehensive induction programs were being
implemented in the treatment schools in all districts, and continued for an additional three years. At
the start of the programs, we surveyed mentors on their background characteristics and reviewed
program documents from the Educational Testing Service (ETS) and the New Teacher Center
(NTC). We administered a background teacher survey in fall 2005, at which time we also requested
teachers' permission to obtain their college entrance exam scores (SAT or ACT). Surveys of teacher
induction activities were administered to both treatment and control teachers during all four years of
the study (2005-2006 through 2008-2009). For the study's core outcomes, we observed classrooms
in spring 2006, collected the districts' student records data following the 2005-2006, 2006-2007, and
2007-2008 school years, and conducted teacher mobility surveys in fall 2006, 2007, and 2008 to
learn about teacher retention. Figure III.1 shows a timeline for the data collection activities.

This report presents findings pertaining to all four years of data collection, both for the set of
districts that received one year of treatment and for those that received two years of treatment.
Response rates and brief descriptions of each data collection activity are provided below. Copies of
the survey instruments may be found in Glazerman et al. (2005). At the end of this chapter, we
present flow diagrams that explain how we used the data we collected to derive our analysis samples
from the pool of teachers we originally identified as eligible for the study. Figure III.2 shows a flow
diagram for one-year districts, and Figure III.3 shows a similar diagram for two-year districts.

A. Mentor Survey 

As part of the treatment intervention, ETS and NTC worked with district staff to hire
44 mentors who would deliver the intervention services, offering support and guidance to help
beginning teachers use evidence from their own practice to recognize and implement effective
instruction. The mentor hiring and duties are described in Chapter IV.

During the ETS and NTC mentor-training sessions in fall 2005, we surveyed all 44 mentors on
their previous mentoring experience, professional background, and basic demographic
characteristics. All of these factors may influence the effect of mentor training on the mentor’s
practice and, in turn, the effect of mentoring on outcomes for beginning teachers. The survey was a
self-administered, paper-and-pencil questionnaire.

B. Beginning Teacher Surveys 

1. Teacher Background Survey 

Starting in October 2005, we administered a baseline survey to the treatment and control 
teachers to gather detailed information about their professional backgrounds, current teaching 
assignments, and demographic characteristics. The survey addressed teachers' professional 
credentials, participation in teacher preparation programs, perceptions of the teaching profession, 
and personal background characteristics, many of which (marital status, spouse’s occupation and 
relocation history, number of young children, and salary at the start of the first year) are 
hypothesized to affect career decisions and hence retention. We mailed the surveys to all sample
members at their schools and followed up by telephone and in person. Although most surveys were
returned in late 2005, we continued to follow up with sample members throughout the school year
in order to achieve a final response rate of 94 percent (92 percent of control group teachers and 97
percent of treatment group teachers).

Figure III.1. Timing of Data Collection

2005-2006 School Year (Study Year 1, July through June)

    Random Assignment
    Mentor Background Survey
    Teacher Background Survey and Consent for SAT/ACT Scores
    Induction Activities Survey, rounds 1 and 2
    Classroom Observation

2006-2007 School Year (Study Year 2, July through June)

    Induction Activities Survey, rounds 3 and 4*
    Mobility Survey, round 1
    School Records, round 1

2007-2008 School Year (Study Year 3, July through June)

    Induction Activities Survey, round 5
    Mobility Survey, round 2
    School Records, round 2

2008-2009 School Year (Study Year 4, July through June)

    Induction Activities Survey, round 6
    Mobility Survey, round 3
    School Records, round 3

* In spring 2007, the Induction Activities Survey was administered only to teachers in the 7 two-year
districts.

One component of this background survey was a consent form asking teachers to permit the 
research team to obtain their college entrance exam scores, either SAT or ACT. These scores, which 
we received from 52 percent of teachers, provide an objective measure of a teacher’s cognitive 
ability before they received any special preparation to enter the profession. Such a measure is useful 
as a potential correlate for teacher effectiveness or a description of the types of teachers who choose 
to stay in or leave the teaching profession (Ferguson and Ladd 1996; Greenwald et al. 1996).

2. Induction Activities Survey

It was important to understand the differences in the services delivered by the comprehensive
and prevailing programs, and to investigate teachers' participation in induction activities after
treatment ended. Treatment teachers in one-year and two-year districts were offered the same usual
services as control teachers following the conclusion of the intervention. Our post-intervention data
can show whether the intervention induced future changes in treatment teachers’ usage of these
services beyond what it would have been in the absence of the intervention. To that end, we
administered a survey of teacher induction activities to both treatment and control teachers twice
during the 2005-2006 school year, and again in fall 2006, fall 2007, and fall 2008." Teachers in the 
seven districts that received two years of comprehensive teacher induction were surveyed an 
additional time during spring 2007 to gather more in-depth information about the induction 
activities in which they participated. Given that the nature of induction activities may change often 
during the school year, the administration of multiple surveys reduced any difficulties teachers might 
have had in recalling the activities over the course of the study, allowing us to detect changes over 
time in the types and intensity of services, such as the amount of time spent in mentor meetings or 
the number of times that administrators observed teachers in the classroom. The current report 
presents the findings from the induction activities surveys administered at all six time points (fall 
2005, spring 2006, fall 2006, spring 2007, fall 2007, and fall 2008). 

These surveys included questions applicable to services delivered by both the comprehensive 
and prevailing programs. The survey asked questions about mentoring from any source, timing and 
duration of mentor interactions, other induction activities such as classroom observations, 
professional development workshops, feedback on instructional practices, and the extent to which 
respondents are satisfied with various aspects of teaching. We mailed the surveys and followed up by
telephone and in some cases used field interviewers to complete the survey in person to achieve a
high response rate.

3. Teacher Mobility Survey 

We sent mobility surveys to all teachers in fall 2006, fall 2007, and fall 2008 to track their career
progress: whether they returned to teaching and, if so, whether they returned to the same school or
district. For those who left teaching, we asked about the circumstances, reasons, and timing of the 
change as well as about their current employment status and plans for returning (if applicable). For 
example, we asked about job responsibilities and salary for those who had changed jobs. As with the 
other teacher surveys, the mobility surveys were self-administered, mail questionnaires with 
telephone and in-person follow-up interviews for those who did not complete the instrument 
by mail. 


'The fall 2005 and spring 2006 induction activities surveys were administered over a period that stretched from
November to early March and late March to June, respectively. Large shares of the surveys were returned in January and
March (28 percent for the first induction activities survey and 48 percent for the second, respectively). One reason for
the variation in completion dates is the variation in the start and end dates for the academic calendars among the 17
districts included in the study.

4. Response Rates to Teacher Surveys


Response rates on teacher surveys ranged from 88 to 97 percent for the treatment group and 78
to 92 percent for the control group (Table III.1). Table III.2 shows response rates for different
subgroups. Despite overall response rates above 80 percent, the control group response rates
persistently fell below those of the treatment group by a margin that was statistically significant. The
degree to which the differential rates bias the findings depends on overall levels of nonresponse and
the nature of nonresponse. Differences between the sample of respondents to the background
survey and the full set of respondents and nonrespondents on observable school characteristics,
the only data available for both respondents and nonrespondents, are not statistically significant (see
Table III.3). This suggests that the sample of teachers for whom we have survey data is similar to
the population of eligible respondents that the sample represents.


Table III.1. Response Rates to Teacher Surveys by Treatment Status

                                 Number of
                                  Eligible         Response Rate (Percentages)
Data Collection Instrument     Respondents   Full Sample   Treatment   Control

Mentor Background Survey                44         100.0       100.0       n/a

Teacher Background Survey*           1,009          94.4        96.6      92.2

Induction Activities Survey
  Fall 2005*                         1,009          89.0        93.3      84.7
  Spring 2006*                       1,009          87.7        92.5      82.9
  Fall 2006*                         1,009          88.7        91.5      85.9
  Spring 2007*                       447^a          83.2        87.9      78.2
  Fall 2007*                         1,009          85.3        90.2      80.2
  Fall 2008*                         1,009          85.0        89.6      80.2

Teacher Mobility Survey
  Fall 2006*                         1,009          88.7        91.5      85.9
  Fall 2007*                         1,009          85.3        90.2      80.2
  Fall 2008*                         1,009          85.0        89.6      80.2

Source: Mathematica teacher induction survey management system.

Note: The Induction Activities Survey and Teacher Mobility Survey were administered together in fall 2006, fall
      2007, and fall 2008.

^a The spring 2007 survey was administered only in the seven districts that received two years of
comprehensive teacher induction.

* Response rates significantly different between treatment and control at the .05 level.

n/a = not applicable.

Table III.2. Response Rates to Teacher Surveys by Subgroup and Treatment Status

Response Rate (Percentages)

                         Teacher     Induction     Induction     Induction     Induction     Induction     Induction
                      Background   Activities,   Activities,   Activities/   Activities,   Activities/   Activities/
                         Survey,     Fall 2005   Spring 2006      Mobility   Spring 2007      Mobility      Mobility
                       Fall 2005                                   Survey,                     Survey,       Survey,
                                                                 Fall 2006                   Fall 2007     Fall 2008
                        T      C      T      C      T      C      T      C      T      C      T      C      T      C

District Type (Years of Implementation)
  One year           97.1   92.7   94.2   85.7   93.5   84.3   92.7   87.2    n/a    n/a   89.5   82.9   86.9   80.4
  Two year           96.1   91.7   92.2   83.4   91.3   81.1   88.8   62.0   87.9   77.9   90.0   75.6   91.8   78.8

Grade Level
  K or Pre-K         96.3   97.2   96.0   90.3   92.5   90.3   94.7   91.3   93.2   88.2   91.3   80.6   90.0   79.2
  1                  98.6   97.2   96.9   94.4   96.9   87.3   96.4   69.7   83.9   81.5   *5.9   88.7   91.8   81.7
  2                  97.6   91.0   96.2   78.2   89.3   76.9   91.0   69.0   92.1   82.9   69.3   76.9   90.5   78.2
  3                  97.5   94.7   96.1   86.0   96.3   80.7   89.7   64.3   91.2   77.8   >6.4   64.2   87.7   87.7
  4                  96.7   91.7   96.0   88.3   93.3   86.7   91.1   64.5   85.0   78.3   65.0   73.3   85.0   76.7
  5                 100.0   96.2   95.7   88.5   97.8   90.4   93.0   91.1   83.3   82.4   >1.3   64.6   89.1   88.5
  Other/multiple     91.5   84.1   82.9   75.2   85.4   75.2   83.5   72.9   82.5   67.8   89.0   74.3   89.0   73.5

School Type (Percent in Free Lunch Program)
  0-49.9%           100.0   93.1   94.6   72.4   94.6   72.4   94.6   66.2  100.0   66.7   11.9   89.7  100.0   82.8
  50-74.9%           95.9   91.4   92.9   84.4   91.8   81.3   90.2   65.6   90.9   77.8   69.8   80.5   87.8   77.3
  75-100%            97.1   92.1   94.1   88.8   92.4   84.9   91.8   65.2   86.0   78.2   69.7   78.9   88.6   81.7
  Unknown            90.0   96.6   83.3   75.9   93.3   79.3   79.3   79.3   89.3   79.3   >6.7   75.9   86.7   65.5

Source: Mathematica teacher induction survey management system; Mathematica Teacher Background Survey
        administered in fall 2005; Mathematica First, Second, Third, Fifth, and Sixth Induction Activities
        Surveys administered in fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008; and Mathematica
        First, Second, and Third Teacher Mobility Surveys administered in fall 2006, fall 2007, and fall 2008
        to all study teachers; Mathematica Fourth Induction Activities Survey administered in spring 2007 to
        study teachers in two-year districts.

Note: The Mathematica Induction Activities Survey and Teacher Mobility Survey were administered together in
      fall 2006, fall 2007, and fall 2008. Treatment and control group sample sizes are shown in Appendix
      Table A.6.

T = Treatment; C = Control; n/a = not applicable.

Table III.3. School Characteristics of Respondents and Nonrespondents

                                             Respondents Only
                                  Background       Induction        Mobility     Respondents and
                                      Survey      Activities         Surveys      Nonrespondents
                                     (n=953)         Surveys         (n=922)           (n=1,009)
                                                     (n=964)

Percent Free Lunch in School
  Unknown                                5.8             5.6             5.3                 5.9
  0-49.9%                                6.7             6.6             6.9                 6.5
  50-74.9%                              22.1            22.3            22.2                22.4
  75-100%                               65.4            65.5            65.5                65.2

Percent White in School
  Unknown                                0.9             0.9             1.0                 0.9
  0-49.9%                               81.1            81.0            80.6                81.4
  50-74.9%                              16.7            16.5            16.8                16.3
  75-100%                                1.6             1.6             1.6                 1.5

Percent Black in School
  Unknown                                0.9             0.9             1.0                 0.9
  0-49.9%                               59.3            60.0            59.8                59.8
  50-74.9%                               6.9             6.9             7.3                 6.8
  75-100%                               32.8            32.3            32.0                32.5

Source: Mathematica analysis using the Common Core of Data 2004-2005 from the National
        Center for Education Statistics.

Note: None of the differences between respondents and the full sample (respondents and
      nonrespondents) are statistically significant at the 0.05 level.


C. Classroom Observations 

We observed classrooms of teachers in the treatment and control groups to measure their
classroom practices in the areas of reading and literacy. We excluded from this data collection any
teachers who were responsible for small classes such as special education resource teachers, taught
special populations such as bilingual classes, taught mathematics only, were not first-year teachers, or
were no longer teaching in the district. Thus, the eligible sample for the classroom observations
(698) was smaller than the full study sample (1,009). Among eligible teachers, we achieved a
response rate of 94 percent for the treatment group and 89 percent for the control group (Table
III.4).

We applied the eligibility rules uniformly to both the treatment and control groups. Some
teachers with prior experience were in the study because their districts insisted, per their normal
practice, that induction be offered to teachers who were new to the district. Because school districts
chose to provide comprehensive induction services to these individuals, it was important to
understand the impact of such services on their subsequent mobility behavior. However, we
excluded such teachers from the classroom practices analysis to focus on the true novice teachers,
those for whom induction was most likely to have an impact on classroom practices. We classified
those who had left the classroom as ineligible for observation instead of “missing” because we
already planned a separate, detailed analysis to deal with attrition from teaching (the teacher
retention/mobility analysis).

'' We chose to focus on reading and literacy given the central role of this subject in elementary education.


Table III.4. Response Status to Classroom Observation and Reasons for Nonresponse

                                                                       Percent of Eligible Teachers
                                      Number of    Percentage          All
Status/Reason                          Teachers   of Teachers    Eligibles   Treatment     Control

Eligibles
  Completes                                 639          63.3         91.6        93.8        89.1
  Refusals                                   59           5.6          8.5         6.2        10.9

Ineligibles
  Does not teach reading in
    classroom setting                       175          17.3          n/a         n/a         n/a
  Not teaching                               64           6.3          n/a         n/a         n/a
  Not beginning teachers/other               72           7.1          n/a         n/a         n/a

Total                                     1,009         100.0        100.0       100.0       100.0

Source: Mathematica teacher induction survey management system.

n/a = not applicable.

The observations focused on pedagogical practices and classroom management. All of the 
classroom observers were or had been classroom teachers themselves and underwent special training 
for this study. They visited the study classrooms in late spring 2006 (toward the end of the year), 
when differences in teacher practices resulting from the comprehensive induction program would 
most likely be evident. They were blind to the treatment status of the classrooms they observed. 

The instrument used to conduct the observations was the Diagnostic Classroom Observation 
(DCO), formerly known as the Vermont Classroom Observation Tool (VCOT). This classroom 
observation tool and the methods used to train observers are described in greater detail in Appendix 
A. We considered many alternative measures of classroom practices but selected the DCO for 
several reasons. First and foremost, the tool incorporates the most appropriate level of detail on 
practices that are believed to be part of good instruction. Although some of the alternatives lent 
themselves to consistent and easy measurement, they tended to focus on activities that could be 
counted, such as the number of times students raise their hands. In addition, they did not capture 
complex teacher behaviors, such as whether the teacher makes connections between reading and 
writing. The DCO measures the teacher practices that current research suggests are essential to good 
teaching or that have been linked to student achievement growth (Cawelti 2004). Second, the DCO 
measures instructional practices that closely reflect those recognized by both the ETS and NTC 
induction programs, particularly literacy instruction. Third, the DCO is simple to complete while in 
the field. Finally, the DCO is an attractive choice because its developers pair the instrument and 
written materials with thorough training. 1 


I " Inter-rater reliability indices from the publisher are not available. In the current study, observers were deemed 
certified to conduct observations based on a comparison of their 16-item scores to the observations of a “gold standard” 
panel; following certification, however, inter-rater reliability was not measured in the field. 

We observed study teachers once while they were teaching a literacy unit. The observations 
lasted between one and two hours, with duration dependent on how the district or school structured 
its class periods. To reduce some of the variability that can occur with literacy classes, trained 
schedulers asked schools to invite observers into the school during the time when teachers were 
most likely to teach reading. More detail on the observation procedures can be found in 
Appendix A. 

Observers scored teachers in each of three constructs based on a set of items that are believed 
to be indicators of good practice: implementation of a lesson, content of a lesson, and classroom 
culture. The three domains are composed of five, four, and seven items, respectively. Observers 
rated the extent of evidence of teacher behavior for each item on a five-point scale, showing (1) no 
evidence, (2) limited evidence, (3) moderate evidence, (4) consistent evidence, or (5) extensive 
evidence. For example, for lesson implementation: “The pace of the lesson is appropriate for the 
developmental level of the students”; for literacy content: “Understanding of content and concepts 
is taught through close reading of text and vocabulary instruction”; and for classroom culture: 
“Classroom management maximizes learning opportunities.” The tool provides observers with 
examples of specific behaviors to look for in assessing the extent of evidence of teacher practice 
within each item. We found all items within each of the three literacy constructs to be highly 
correlated with other items in the construct, based on standardized inter-item reliability coefficients. 
Psychometric details are presented in Appendix A. 
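One common reading of a "standardized inter-item reliability coefficient" is the standardized form of Cronbach's alpha, which depends only on the number of items and their average pairwise correlation. The sketch below illustrates that formula under this assumption; it is not taken from the study, the function names are our own, and the ratings are hypothetical. Appendix A remains the authoritative description of the psychometric methods.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def standardized_alpha(items):
    """Standardized alpha: k * r_bar / (1 + (k - 1) * r_bar).

    `items` is a list of k lists, each holding one rating per observed teacher.
    """
    k = len(items)
    pairs = list(combinations(items, 2))
    r_bar = sum(pearson(a, b) for a, b in pairs) / len(pairs)
    return k * r_bar / (1 + (k - 1) * r_bar)


# Hypothetical 1-5 ratings for a four-item construct across six teachers.
content_items = [
    [3, 4, 2, 5, 3, 4],
    [3, 5, 2, 4, 3, 4],
    [2, 4, 3, 5, 3, 5],
    [3, 4, 2, 5, 4, 4],
]
alpha = standardized_alpha(content_items)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items within a construct move together, which is the property the text describes as items being highly correlated with one another.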

D. Student Records 

To gauge how comprehensive induction affected student achievement, we collected student 
data directly from school districts. The data include student scores, linked to teachers, on 
standardized tests administered in the spring of each study year (the posttest) and scores for the 
same students from tests taken in the spring of the prior year (the pretest). For example, in the third 
year of the study, districts provided pretest scores from spring 2007 and posttest scores from spring 
2008." Districts also provided student background data, including race/ethnicity, eligibility for free 
or reduced-price meals under the National School Lunch Program, English language learner status, 
disability status, and date of birth (to determine which students were over age for grade). 

As shown in Figures III.2 and III.3, some teachers were not eligible for the test score analysis 
because their districts did not provide pretest and posttest data. State assessment systems under No 
Child Left Behind typically test students beginning in grade 3, which implies that only teachers in 
grades 4 and 5 in K-5 elementary schools routinely have students with both posttest and pretest 
scores. Across one-year and two-year districts and treatment and control groups, of the 
1,009 teachers who began in the study in the 2005-2006 school year, districts provided student test 
score data for 190 teachers in the most recent year of the study, the 2007-2008 school year. The 
other 819 teachers were either no longer teaching in the district, teaching in non-tested grades or 
subjects, or teaching in one of two districts that did not provide usable test score data for the 
teachers’ third year of teaching. 1 ' 


11 For three districts that tested at least some students in the fall, we used a fall test as a pretest (at the beginning of 
the year in the study teacher’s classroom) and/or a fall test as a posttest (at the beginning of the next year following 
enrollment in the study teacher’s classroom). 

Of the 190 teachers for whom we received student test score data, we excluded 27 teachers 
from the student achievement analysis. A total of 7 teachers were linked to too many students 
to plausibly be regular classroom teachers, as explained in Appendix A. We believe that 
these teachers may have been mistakenly linked to students. Another 19 teachers were teaching in 
grade levels for which a treatment-control comparison could not be made within their district. We 
deleted one teacher with anomalous test score gains. 1. The requirement that each grade within a 
district have both treatment and control teachers is to ensure that any peculiar test characteristics, if 
they exist, are represented in both the treatment and control groups. 1 1 

The result of these eligibility restrictions is that the analysis sample for conducting test score 
analysis is smaller than the analysis sample for other outcomes, as anticipated in the design report 
(Glazerman et al. 2005). Teachers in the student achievement analysis sample in year 3 represented 
82 percent of eligible teachers in reading and 84 percent in math. The resulting standard errors of 
test score impact estimates were in the range of 0.036 to 0.064, meaning that an impact in effect size 
units of 0.071 to 0.126 would be statistically significant. Although the eligibility restrictions for the 
test score analysis result in a sample that has a different mix of districts and grades than the larger 
sample used to analyze teacher retention and other outcomes, and the reasons that some teachers or 
students are excluded from the analysis sample may be related to test score outcomes, none of these 
reasons is likely to be related to treatment status. Therefore, we conclude that the estimated impacts 
on test scores are internally valid. Nonetheless, readers should exercise caution in generalizing the 
findings because the grades and districts are not a random subset of the full sample. 

We made treatment-control comparisons within grades within districts, and then aggregated 
across grades and districts. Scores were scaled scores, normal curve equivalents, percent correct, or 
percentile rankings. Within each district-grade combination, we rescaled tests by subtracting the 
mean score of all students who took that test and dividing by the standard deviation for all test 
takers. Typically, we used means and standard deviations from a state reference group or a grade- 
representative norm sample. Further details on aggregation are presented in Appendix A. 
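Two calculations described in this section lend themselves to a short illustration: the smallest impact that would reach statistical significance for a given standard error, and the rescaling of test scores against a reference group. The sketch below is a minimal illustration, not the study's code. It assumes a two-sided 5 percent test with a normal critical value of 1.96, which is our assumption; the published thresholds of 0.071 to 0.126 may reflect unrounded standard errors or a slightly larger critical value. The function names and the example scale values are hypothetical.

```python
def min_significant_effect(standard_error, critical_value=1.96):
    """Smallest absolute impact (in effect size units) that would be
    statistically significant, assuming a normal critical value."""
    return critical_value * standard_error


def standardize(score, ref_mean, ref_sd):
    """Rescale a raw score by subtracting the reference-group mean and
    dividing by the reference-group standard deviation."""
    return (score - ref_mean) / ref_sd


# Standard errors at the two ends of the range reported in the text.
low = min_significant_effect(0.036)
high = min_significant_effect(0.064)

# Hypothetical scale: a score of 520 on a test whose reference group has
# mean 500 and standard deviation 40.
z = standardize(score=520.0, ref_mean=500.0, ref_sd=40.0)

print(round(low, 3), round(high, 3), z)
```

Because every test is expressed in standard deviation units of its own reference group, scores reported as scaled scores, normal curve equivalents, percent correct, or percentile rankings can be pooled across districts and grades on a common scale.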


11 All districts provided test score data for the teachers’ first two years of teaching, as detailed in Glazerman et al. 
(2008) and Isenberg et al. (2009), but one of these districts was unable to link teachers to students. We did not collect 
data from this district for the teachers’ third year. Another district refused to provide data for the teachers’ third year. 

11 We deleted teachers whose average student gain scores were greater than 1.5 standard deviations above the mean 
for the reference group (state or norm sample). This resulted in the loss of one classroom, whose students had gains in 
reading scores that were below the state average, in line with other classrooms in the study, but gains in math scores that 
would have placed most of them in the 94th percentile or above for the state. 

11 Counts of teachers with valid data pertain to math scores only, for illustration. The corresponding sample sizes 
for the reading analysis are shown along with those for the math analysis in Figures III.2 and III.3. 

Figure III.2. Flow of Teachers Through the Study in One-Year Districts 

Figure III.3. Flow of Teachers Through the Study in Two-Year Districts 

E. Other Supporting Data 

To interpret the impact findings, we needed to understand how comprehensive teacher 
induction programs were delivered and how they compared to the existing array of services. The 
induction activities surveys were the primary data source, but we gathered supplemental data to 
enrich the analysis. WestEd staff reviewed materials supplied by the two comprehensive induction 
program providers (ETS and NTC) to supplement the information we collected through the 
induction activities surveys. The materials, which provided the basis for the detailed description of 
program support (see Chapter IV), included documents such as training agendas and materials, 
curriculum guides, and assessment tools. 

IV. PROGRAM IMPLEMENTATION AND INDUCTION SERVICE CONTRAST 

To characterize the nature of comprehensive teacher induction and the level of services 
provided to beginning teachers in the control condition, we measured the types, frequency, and 
duration of induction activities in both the treatment and control groups from the perspective of the 
teachers. For the treatment group, we collected additional data on teacher attendance at program 
events and mentor background characteristics and experience. 

This chapter has two parts. The first part describes the intervention provided to the treatment 
group during the 2005-2006 and 2006-2007 school years. During the 2005-2006 school year, 
services were provided in all 17 study districts. In 2006-2007, services continued in 7 of the 
17 districts. 

The second part of the chapter compares the induction experiences of teachers in the treatment 
group with the experiences of those in the control group, both during and after implementation of 
the comprehensive induction services in the treatment schools, in both one-year and two-year 
districts. The gap in services, or service contrast, represents the effect of offering treatment on the 
type and intensity of induction services received. According to our model of induction services 
(Figure I.1), the service contrast should be an important precursor to impacts on desirable outcomes 
such as student test scores and teacher retention. 

A. Comprehensive Teacher Induction 

To test the hypothesis that a comprehensive teacher induction program would be more 
effective than the services normally provided to beginning teachers by their schools and districts, we 
had to identify such a program as well as a provider of program services. Accordingly, we issued a 
Request for Proposals (RFP) in 2004. The RFP specified that the induction program should include 
components that earlier research and professional wisdom gleaned from practice had suggested were 
important features of successful teacher induction programs (Alliance for Excellent Education 2004; 
Ingersoll and Smith 2004; Smith and Ingersoll 2004; Kelly 2004; Serpell and Bozeman 2000). The 
components include carefully selected and trained full-time mentors; a curriculum of intensive and 
structured support for beginning teachers, including orientation, professional development 
opportunities, and weekly meetings with mentors; a focus on instruction, with opportunities for 
novice teachers to observe experienced teachers; formative assessment tools that permit evaluation 
of practice on an ongoing basis and require observations and constructive feedback; and outreach to 
district- and school-based administrators to educate them about program goals and to garner their 
systemic support for the program. 

A group of outside expert reviewers read and scored the proposals received in response to the 
RFP. Among those submitted, the ETS and NTC proposals stood out as most closely meeting the 
study’s specified requirements. We selected these programs in order to determine whether the 
comprehensive induction model is effective in improving classroom practices, student achievement, 
and teacher retention, rather than whether a particular comprehensive induction program is effective 
in improving these outcomes. Including two programs increased our ability to generalize findings of 
the comprehensive induction model relative to including just one program. Furthermore, the expert 
panel that was convened to select the study’s intervention rated both the ETS and NTC programs as 
high in quality, and the panel agreed that they were similar enough in goals and structure that 
including both (and pooling impact data across the two programs) would be a fair test of the 
comprehensive induction model. 

The detailed description of the two programs in the following sections is based on information 
from program documents and data from WestEd's external monitoring of the induction programs' 
implementation in all districts during 2005-2006 and in the seven districts implementing a second 
year of induction during 2006-2007. In the first year, WestEd monitors observed all mentor training 
sessions and webinars (web-based seminars provided by ETS) conducted by the programs and 
reviewed materials for each event in advance. Monitors interviewed program leaders and staff and 
received reports from them regularly, weekly at start-up and monthly later in the school year. For 
each program, the monitors also observed one initial local orientation for beginning teachers, one 
for administrators, and an end-of-year colloquium for beginning teachers. 

WestEd monitors visited each district in the fall; in the spring, they either visited again or 
conducted semi-structured telephone interviews. 1 ' Monitors also conducted end-of-year visits, 
observed a professional development and/or study group session for beginning teachers, observed 
one weekly mentor meeting, and joined at least one mentor during regular weekly visits with two to 
four of his or her beginning teachers. During visits and telephone calls, monitors spoke separately 
with the district coordinator and each mentor to gauge whether districts were receiving all prescribed 
services from the induction programs; whether the nature and level of effort in districts’ 
implementation were consonant with the programs’ intent; whether district coordinators were 
enabling mentors to fulfill their roles, and whether mentors were carrying out their roles as planned; 
what local challenges were impeding implementation, if any; and what plans districts and programs 
had for addressing such challenges. 

In the second year of implementation in the seven two-year districts, WestEd reviewed 
materials and attendance data for each major professional development event and conducted 
interviews and received reports on a schedule similar to that of the first year. WestEd monitors also 
made two- or three-day site visits in the first months of the school year to two of the three NTC 
districts and three of the four ETS districts. During these visits, monitors interviewed district 
coordinators and mentors and observed professional development events for beginning teachers. 
Monitors also conducted semi-structured telephone interviews with all district coordinators at the 
beginning and end of the school year. All but two districts were followed by the same WestEd 
monitor as in year 1. In these two exceptions, circumstances made it necessary to assign different 
WestEd monitors, but they had had full monitoring experience with other districts during year 1. 

Practitioners and policymakers should be aware that the programs implemented in this study by 
ETS and NTC were not necessarily the same models that would be delivered outside the study 
context. First, for study purposes, the objective was consistent implementation of each program, 
with a high level of fidelity to program design and a quick response to any implementation issues. 
Second, the providers adapted their programs to ensure that the required components were included 
in a one-year curriculum to reflect the initial study design. Once it was decided to add a second year, 
the programs made additional modifications and adaptations to extend the curriculum another year. 
Finally, the providers adjusted their usual methods of service delivery to meet the requirements of 
the study in both years. To implement the mentor training, each program organized off-site mentor 
training sessions, bringing together the mentors from all of the districts in which they were 
operating, as described later. Outside the study context, when there is district-wide implementation 
with a larger number of mentors, training typically occurs within the district, rather than off site with 
mentors from other districts. 

14 Four of the nine ETS districts (44 percent) and three of the eight NTC districts (38 percent) were visited. The 
others were interviewed by telephone. 

1. Administrative Support Structure 

To understand the treatment provided by each program, we begin with an overview of the key 
roles played by designated staff members in implementing the programs (Figure IV.1). Oversight for 
implementation of the ETS and NTC programs was the responsibility of a designated staff member 
from the respective organizations. These program leaders directed all activities and provided 
substantive leadership. They led the adaptation of program materials for use in the study, played 
integral roles in the design and delivery of mentor trainings, and supported the work of their own 
program staff and site-based district coordinators. They held monthly staff meetings and stayed in 
close contact with district coordinators for purposes such as preparing or debriefing the weekly 
mentor meetings, providing ideas for optimizing mentors’ working conditions, monitoring the 
fidelity of district implementation of induction program content and activities, and fostering 
productive relationships among various staff members. In year 2, an ETS co-leader left the study 
and was replaced by one of the mentors, whereas the NTC leader continued in her role. 1 

In collaboration with the program leaders, designated ETS and NTC program staff worked 
with assigned districts to help implement the program consistently across the districts. 1 ’ In the 
second year, in the seven districts that continued implementation, all program staff had experience 
in this role from the previous year. Three districts were served by the same person as in year 1; two 
ETS and two NTC districts were served by a different person in the second year. The program staff 
made monthly visits to each district, during which they delivered or facilitated a professional 
development session for beginning teachers, worked with district coordinators on issues related to 
program implementation, met with the mentors to continue building their skills, and shadowed them 
on their weekly visits with beginning teachers. While shadowing the mentors, program staff could 
observe firsthand any needs for program support as related to mentoring skills or the use of 
program processes and tools. This provided staff with the opportunity to discuss how the program 
could best address the needs and circumstances of teachers in each setting. Between visits, program 
staff engaged in regular and frequent communication with mentors and district coordinators to 
discuss any issues that surfaced and to provide ongoing direction. 


“ In addition, WestEd staff provided external oversight of services provided in order to help address any issues 
that arose and to keep implementation consistent across all sites. 

r The ETS co-leader for the study, who had served under the program leader in year 1, left because of personal 
circumstances. A mentor from year 1 was promoted to serve as co-leader in year 2, and this person also continued to 
serve as program staff for a district. Whereas the NTC leader continued in this role, this person also served as program 
staff for one of the districts in year 2. 

" Each program staff member served one or two districts. Staff members spent between 20 percent and .Ml percent 
of their time serving each district. 

Figure IV.1. Structure of Roles in the Induction Program 

[Figure not reproduced. Recoverable labels: Induction Program (ETS or NTC); School Districts.] 


Districts designated their own staff members to provide local oversight to program 
implementation. District coordinators worked in departments of human resources or professional 
development. In year 1, a key function was to help establish district positions for mentors and 
recruit candidates for these positions, establish procedures for job reporting and evaluation, and create 
functional working conditions for mentors by locating office space and setting up email and 
telephone access. They also helped identify beginning teachers to participate in the study, assign 
teachers to mentors, find appropriate settings for program events and schedule them on the district’s 
master calendar, and address occasional program implementation challenges. In both years of 
program implementation, district coordinators facilitated mentors’ weekly meetings and joined 
mentors at off-site trainings throughout the year. To reduce the chances that treatment and control 
groups would share any services or resources, we asked districts to assign coordinators who would 
not also be involved in the district’s own induction activities at the elementary level. 

The individuals serving as district coordinators in year 1 continued in that role in year 2; in one 
district of each program, however, a replacement was named because the original person could not 
continue due to changes in her main position. The district coordinators worked with the programs at 
the outset of year 2 to adjust mentors' workloads depending on which beginning teachers stayed or 
left after year 1, arranged settings for program events, and scheduled them on the district’s master 
calendar. In both years, district coordinators spent 10 to 15 percent of their time on these functions, 
with considerably more time early in the year and much less time as the year progressed (about 30 
percent and less than 10 percent, respectively, in year 1, and about 20 percent and less than 10 
percent, respectively, in year 2). 

According to interviews with district coordinators by WestEd monitors, those with more 
influence in the district were better able to broker the organizational arrangements that needed to be 
made across district departments and levels. For example, coordinators had to obtain approval for 
scheduling professional development sessions on the district's master calendar and locate mentor 
offices or rooms to serve as meeting spaces. Factors that helped coordinators in their role included 
the support of high-level district administrators, coaching or mentoring experience, and good 
rapport with program staff. In contrast, smooth program implementation was more difficult when 
coordinators were less responsive or influential. Given that the coordinator role was an addition to a 
full set of existing responsibilities, coordinators struggled to carve out the time needed for program 
implementation. 

Principals also played an important role in program implementation. Both ETS and NTC asked 
principals to encourage and support beginning teachers’ participation in induction activities, 
particularly by permitting them to attend professional development sessions and minimizing 
conflicts that could impede mentors’ efforts to schedule time with them. In both school years, the 
programs offered an initial orientation for administrators, and NTC held a fall and spring 
administrator briefing over breakfast.' During these events, program leaders and district 
coordinators sought to gain administrators’ support for their beginning teachers’ participation in the 
induction program and for the involvement of the mentor assigned to their school. The orientation 
events provided brief overviews of beginning teachers’ needs for support and development and the 
induction program's purposes and activities. Both programs strongly cautioned mentors against 
sharing specific information with principals that could affect the beginning teachers' job evaluations 
and compromise confidentiality and openness in the mentor/mentee relationship. 

Overall, school and district officials evidenced wide variation in the level of principal support, 
ranging from those who were extremely supportive, actively encouraging teachers to make the most 
of the induction opportunities, to principals who actively resisted participation and would not permit 
teachers to be released for program activities." The resistant principals either required beginning 
teachers to attend school or district events that conflicted with induction program activities or 
imposed heavy restrictions on when mentors could visit teachers. During year 1, five principals out 
of the 210 treatment schools in the study fell into this latter category. Such resistance abated over 
the course of this year and the next in response to the intervention of district coordinators, mentors, 
and program staff. Induction programs encouraged mentors to visit their beginning teachers’ 
principals at least once a month. When program staff shadowed mentors, they also met briefly with 
principals who did not strongly support the induction program in order to help convince them of its 
value. 


w When ETS and NTC are contracted by a district to implement their respective programs, not in the context of a 
study, district coordinators spend more than 15 percent of their time on program implementation. 

31 In year 2, NTC facilitated mentors taking a presentation role for part of the event to enhance principals’ 
perception of their roles and expertise. 

WestEd’s monitors gathered this information through interviews with program leaders, district coordinators, and 
mentors, and through direct observations of participants at the NTC administrator breakfast briefings. 

2. Mentors 

At the heart of the comprehensive induction services was the support provided by a highly 
trained, full-time mentor. Mentors were most frequently responsible for 12 beginning teachers 
(32 percent), although caseloads ranged from 8 to 14 teachers over the course of each year. With 
mentoring as the largest component of the comprehensive induction programs, mentors necessarily 
underwent careful selection and training. At the outset of the study, programs worked with each 
district by providing them with a written job posting, guidance on selection of the Interview Team, 
and a set of interview questions and rubric for mentor selection. The selection rubric called for 
individuals with a minimum of five years of teaching experience in elementary school, recognition as 
an exemplary teacher, and expertise in designing and implementing standards-based instruction. In 
each district, candidates who met these criteria were interviewed by a committee that included the 
district coordinator for the study and other participants, such as representatives from human 
resources, the teacher’s union, and professional development; an assistant superintendent for 
instruction; other experienced mentors; and/or school administrators. Staff from the comprehensive 
induction programs traveled to the interviews or conducted telephone consultations with the district 
coordinators to help in the selection of mentors, though districts were responsible for the final 
decisions. In all but three districts, two or more people applied for each mentor position. One 
instance of turnover among mentors occurred during the first year of program implementation. 
Mentors involved in year 1 implementation continued to fill the mentor positions for year 2 of the 
study. Because some beginning teachers left teaching or the participating districts after year 1, 
mentor caseloads were adjusted at the beginning of year 2. Whenever possible, beginning teachers 
were served by the same mentor during years 1 and 2." 

Table IV.1 describes the background of the 44 mentors selected to deliver the comprehensive 
induction services in the study districts. These data are taken from a survey administered to mentors 
at the outset of program implementation in year 1. All mentors reported at least 5 years of teaching 
experience, with an average of 17.9 years. Most (86 percent) held a master’s degree and 14 percent 
were certified through the National Board of Professional Teaching Standards. A majority 
(82 percent) had come to the mentoring role from a position as a classroom teacher, and 46 percent 
had at some point worked in nonteaching positions in education. The average age of these mentors 
was 43 in 2005 and 51 percent were white, non-Hispanic. Although the mentors were implementing the 
particular program under study for the first time during the 2005-2006 school year, 77 percent 
reported having prior mentoring experience (6.2 years on average), and among those, 74 percent 
had previously attended mentor training. The most commonly reported areas of training addressed 
classroom management, giving effective feedback, and mentor roles (over 87 percent for each area). 


— Halfway through year 2, one NTC mentor left the study for a career advancement opportunity; the service loads 
of remaining mentors in this district were reconfigured to distribute responsibility for the beginning teachers previously 
assigned to the departing mentor. 
Table IV.1. Mentor Characteristics

Characteristics                                                      Percentage

Race/Ethnicity: Percentage White, Non-Hispanic                          51.2
Education: Has Master's Degree                                          86.4
Certified Through National Board of Professional Teaching
  Standards (NBPTS)                                                     13.6

Teaching Experience
  Last position before mentoring was as a classroom teacher             81.8
  Ever worked in nonteaching position(s) within education               45.5

Mentoring Background
  Any mentoring experience                                              77.3
  Any previous mentoring training (if have mentoring experience)        73.5

Areas of Mentor Training (If Received Mentor Training)
  Classroom management                                                  87.5
  Giving effective feedback                                             87.5
  Mentor roles                                                          87.5
  Coaching strategies                                                   80.0
  Lesson planning                                                       79.2
  Classroom observations                                                65.2
  Helping adult learners set goals                                      52.2
  Analyzing student work                                                50.0
  Leading study groups                                                  39.1
  Coaching in literacy/language                                         27.5
  Coaching in math                                                      20.8

                                                                                Range
                                                                     Average    (Min., Max.)

Age in 2005 (Years)                                                    43.0     (28, 61)
Teaching Experience (Years)                                            17.9     (5, 35)
Experience in Nonteaching Position(s) Within Education (Years)          1.4     (0, 6.8)
Years of Mentoring Experience (If Have Mentoring Experience)            6.2     (1, 30)
Caseload (Number of Beginning Teachers)                                11.7     (8, 14)

Sample Size (Mentors)                                                  44

Source: Mathematica Mentor Survey administered in fall 2005 to all study mentors.


Once mentors were selected for program participation, both ETS and NTC trained their
respective mentors during the first year of program implementation in four training sessions each
that were extensive, intensive, and focused. Two of the eight trainings were fully attended. At each
of the other six trainings, one mentor was absent (a different person in each instance) because of
personal circumstances. Each program brought mentors together for a total of 10 or
12 days (ETS and NTC, respectively), devoting two to three days per session (Figure IV.2). By
convening mentors from all of a program's study sites at a single location, trainings provided
opportunities for cross-site collaboration designed to enrich learning of the programs’ curricula and
also to foster concrete discussions about how best to address any implementation issues. By holding
sessions over the course of the 2005-2006 school year, program staff were able to provide training
as it was needed. Trainings previewed the content of upcoming professional development sessions
and gradually introduced forms and processes of mentor/mentee work. For example, forms and
processes for beginning teachers' midyear reflections on their instructional practices and
professional development were not introduced to mentors until the second training (fall); ways for
beginning teachers to analyze student work in the spring were introduced during the third training
(winter); and the fourth training (spring) explored ways of prompting beginning teachers to initiate
longer-range goals for their development.

Trainings focused on active learning in two main areas: (1) improving beginning teachers' 
instruction, including the use of forms and processes to advance it; and (2) mentoring skills for 
working with beginning teachers, such as using evidence from teachers' instruction rather than 
presenting opinions, and conversational techniques, such as paraphrasing and asking clarifying 
questions. Programs also spent some training time on how to address beginning teachers' survival 
needs and other more general needs, with ETS spending 5 percent of mentors’ training time and
NTC spending up to 10 percent of training time on this topic.


The programs were also intentionally designed to provide mentors with support and 
development opportunities throughout the academic year via activities beyond the four formal 
training sessions. The planned activities involved interaction with program staff, other mentors, and 
district coordinators. WestEd's monitoring data indicate that when program staff visited their
districts each month, they joined the weekly meeting to help mentors become more familiar with 
program content and tools. The weekly meetings also allowed mentors to exchange ideas on 
successes and challenges in working with beginning teachers and in gaining the support of building 
administrators. At the outset of the school year, district coordinators provided substantive advice 
during weekly mentor meetings and three-quarters of them continued to join mentor meetings 
throughout the year. Program staff and district coordinators regularly responded to telephone or 
email inquiries from mentors, and the ETS program held two one-hour webinars for mentors and 
district coordinators. The fall webinar helped mentors shift from providing the type of general 
support needed by beginning teachers at the outset of the year to focusing on specific development 
of teachers’ instructional practices. During the spring webinar, coordinators and mentors shared 
ideas for planning the end-of-year colloquium. (The NTC program did not include webinars but 
covered these topics during its additional two days of mentor training over the year.) 


21 Examples of survival and more general needs include how to interact with the principal, how to deal with
teachers’ emotional needs, how to deal with a particularly difficult student, or how to find classroom resources.
Figure IV.2. Comprehensive Induction Program Training for Mentors, District Coordinators, and
Administrators: 2005-2006 School Year

[Figure not reproduced: timeline of training activities, with ETS and NTC shown on either side of a
horizontal divider. Axis labels recoverable from the original: May 2005, Start of School Year,
Month 3, Month 5, Month 7, End of School Year.]

Note: Activities common to both providers are shown on both sides of the horizontal divider between ETS and
NTC. The district orientation was offered to district coordinators and administrators from the central
office. The administrator orientation was offered to school building administrators.


The program leaders and program staff also reviewed and provided feedback on the logs used 
by mentors to summarize weekly meetings with teachers. Feedback included discussion about why a 
beginning teacher was requiring or receiving more or less contact time than average, ideas for 
addressing beginning teachers' needs, how to use program tools, and how to stay on schedule with 
program implementation. 


During the second year, ETS and NTC continued intensive training of their respective mentors 
in the seven districts that continued program implementation. Each program brought mentors 
together for a total of 8 and 10 days over three and four sessions (ETS and NTC, respectively), 
devoting 1.5 to 2.5 days per session (Figure IV.3). In addition to trainings, NTC held a late summer 
retreat with its mentors to debrief the first year of program implementation and help with the final 
strategic planning for the second year. At the outset of the 2006-2007 school year, ETS held a
two-hour webinar for initial orientation of its mentors, whereas NTC held an early training session. A
second ETS webinar was held between the first two ETS trainings. One of the districts hosted a
training later in the year.

Figure IV.3. Comprehensive Induction Program Training for Mentors, District Coordinators, and
Administrators: 2006-2007 School Year

[Figure not reproduced: timeline of training activities, with ETS and NTC shown on either side of a
horizontal divider. Axis labels recoverable from the original: Start of School Year, Month 3,
Month 5, Month 7, End of School Year, May 2007.]

Note: Activities common to both providers are shown on both sides of the horizontal divider between ETS and
NTC. The administrator orientation was offered to school building administrators.


All mentors participated in the trainings, which reflected a focus similar to that of year 1. Given
mentors’ experience from their training in the first year, activities during the second year included
less emphasis on learning mentoring skills. Instead, NTC training paid particular attention to the
equitable engagement of diverse students, and part of the spring training was spent having mentors
shadow their peers during meetings with beginning teachers. For ETS, the training was expanded to
include a focus on the content and conduct of its Teacher Learning Communities, a new component
of its professional development activities in year 2, described later in this chapter.

As in year 1 of implementation, the programs were also intentionally designed to provide mentors
with support and development opportunities throughout the academic year through activities
beyond the formal training sessions, using the same strategies described earlier.
3. Program Services and Activities

The comprehensive induction programs both provided structured mentoring following a
curriculum designed to improve instructional practice, monthly professional development sessions,
opportunities to observe veteran teachers, and an end-of-year colloquium. Below we describe the
services and activities in the first and second years of program implementation.

a. Year 1 Program Services and Activities (2005-2006 School Year) 

In the first year of program implementation, mentoring of beginning teachers began during the
first week of school whenever possible, following an orientation session during which teachers were
introduced to induction program goals and schedules. On average across the districts, half of the
mentors were able to visit their beginning teachers before the first day of school to get acquainted
and help set up classrooms. Once the school year was underway, mentors tried to visit their
beginning teachers at the same time every week, but meetings were rearranged as needed to
accommodate circumstances or to accomplish a specific task, such as observing a particular lesson.

All beginning teachers in the treatment group were also expected to participate in monthly
professional development (PD) sessions, and the ETS districts offered monthly study groups, which
were mentor-facilitated peer support meetings for beginning teachers. Beginning teachers also observed
veteran teachers once or twice during the year. At the end of the school year, beginning teachers
participated in a colloquium. Each of these induction activities is described in more detail below.

Mentoring. Both the ETS and NTC programs consist of a yearlong curriculum for beginning
teachers that focuses on effective teaching (Table IV.2). The ETS program defines effective teaching
in terms of 22 critical components organized into four general domains of professional practice. The
components are aligned with the Interstate New Teacher Assessment and Support Consortium
(INTASC 1992) principles. The NTC induction model defines effective teaching in terms of six
Professional Teaching Standards. Each standard or domain is broken into a succession of more
discretely defined categories of teaching behaviors. The mentor’s goal is to help beginning teachers
use evidence from their own practice to recognize and implement effective instruction as defined by
the domains or standards. Both induction programs use a continuum of performance as a means for
teachers to establish a benchmark and improve their instructional practice (Table IV.3).


The primary obstacle to holding these early meetings was the delay in district staff’s identifying the beginning
teachers in each school for the study. This challenge was due to operating in a study context; districts may have been 
able to begin providing mentoring services more quickly in the absence of the study since they could have sent mentors 
out to schools in which principals could readily identify beginning teachers with whom they would work. Additionally, 
12 percent of beginning teachers were hired after the school year began, further contributing to delays in identifying 
teachers and assigning mentors. 

* Especially in the early part of the 2005-2006 school year, mentors spent extra time with beginning teachers who
were experiencing serious survival or instructional challenges (data on the frequency and duration of these meetings are
unavailable). Program staff monitored these situations to ensure that such services did not take time away from focusing
on instruction for those teachers who were on track in their development.

The ETS program derives its content from Enhancing Professional Practice: A Framework for Teaching (Association for
Supervision and Curriculum Development 1996).

The content of the NTC program is based on two documents: California's Standards for the Teaching Profession
(California Commission on Teacher Credentialing 1997) and Continuum of Teacher Development (New Teacher Center 2002).
The first-year curriculum of ETS is organized around seven Pathwise Induction Events, each of
which is designed to help beginning teachers explore a particular aspect of their practice and become 
increasingly proficient as an educator. The initial event requires teachers to investigate their school 
and community and to develop profiles of the students in their class. In two events, mentors 
observe beginning teachers in the classroom and provide feedback on their practices, planning 
materials, and students’ work. Three events involve a structured series of activities through which 
teachers explore a certain aspect of their practice as related to (1) establishing a positive classroom 
environment, (2) designing an instructional experience, and (3) analyzing students’ work. Teachers 
identify a particular practice in each of these areas, implement it, and then reflect on the experience. 
Each event concludes with the development of an Individual Growth Plan in that respective area. 
The last event is a colloquium for all beginning teachers in a district during which they conduct a
self-assessment. 

The centerpiece of the NTC mentoring model is the NTC Formative Assessment System 
(FAS). FAS involves a series of collaborative processes between the mentor and the beginning
teacher that aims to collect and analyze a variety of data focused on teacher practices and student
learning. A set of protocols and forms helps structure mentor/teacher interactions, although an
individual teacher’s needs determine the precise focus and pace. FAS’s central tool is a collaborative 
assessment log that provides the framework for the mentor's and beginning teacher’s weekly 
conversation. The teacher uses the log to record information on recent successes and challenges and 
specific next steps. FAS focuses on two key areas in a teacher’s development: (1) professional goal 
setting and (2) classroom practices. Professional goal setting involves both setting goals and 
reflecting on instructional practices in relation to the model’s six teaching standards (Table IV.2) and 
the continuum of performance (Table IV.3). Teachers identify an area of practice as a focus area, 
develop a plan to achieve particular goals, and then assess their progress. Teachers establish an 
individual learning plan and conduct a midyear review to assess progress in meeting goals. 
Table IV.2. ETS and NTC Content: Four Domains and Six Professional Teaching Standards

ETS Domains of Professional Practice

Domains
  1. Planning and preparation
  2. Classroom environment
  3. Instruction*
  4. Professional responsibilities
  * See next list for details

Example: Subcategories of a Domain (Instruction)
  Communicating clearly and accurately
  Using questioning and discussion techniques
  Engaging students in learning*
  Providing feedback to students
  Demonstrating flexibility and responsiveness
  * See next list for details

Example: Details of Subcategory (Engaging Students in Learning)
  Representation of content
  Activities and assignments
  Grouping of students
  Instructional materials and resources
  Structure and pacing

NTC Professional Teaching Standards

Professional Teaching Standards
  1. Planning instruction and designing learning experiences
  2. Creating/maintaining effective environments
  3. Understanding/organizing subject matter
  4. Development as a professional educator
  5. Engaging/supporting all students in learning*
  6. Assessing student learning
  * See next list for details

Example: Subcategories of a Standard (Engaging/Supporting All Students in Learning)
  Connecting prior knowledge, life experiences, and interests with learning goals
  Promoting self-directed, reflective learning for all students*
  Using variety of instructional strategies and resources to respond to students' diverse needs
  Facilitating learning experiences that promote autonomy, interaction, and choice
  Engaging students in problem solving and critical thinking to make subject matter meaningful
  * See next list for details

Example: Details of Subcategory (Promoting Self-Directed, Reflective Learning for All Students)
  Motivate students to initiate their own learning and strive for challenging goals
  Describe their learning processes and progress
  Explain clear learning goals for students
  Engage students in examining their work and work of peers
  Help students develop and use strategies for knowing, reflecting on, and monitoring their learning
  Help students use strategies for accessing knowledge and information
  (Note: Above entries are slightly abbreviated versions of the source document.)

Source: The ETS program derives its content from Enhancing Professional Practice: A Framework for Teaching
(Danielson 1996). The content of the NTC program is based on two documents: California's Standards
for the Teaching Profession (California Commission on Teacher Credentialing 1997) and Continuum of
Teacher Development (New Teacher Center 2002).
Table IV.3. Example of ETS and NTC Detailed Specifications for Development of Beginning Teachers'
Practices

ETS: Domain 3 (Instruction): Engaging Students in Learning: Representation of Content

  Level 1: Unsatisfactory
    Representation of content is inappropriate and unclear or uses poor examples and analogies.

  Level 2: Basic
    Representation of content is inconsistent in quality; some portions are done skillfully, with
    examples, while others are difficult to follow.

  Level 3: Proficient
    Representation of content is appropriate and links well with students’ knowledge and experience.

  Level 4: Distinguished
    Representation of content is appropriate and links well with students’ knowledge and experiences.
    Students contribute to representation of content.

NTC: Standard 5 (Engaging/Supporting All Students in Learning): Promoting Self-Directed, Reflective
Learning for All Students

  Level 1: Beginning
    Directs student learning experiences and monitors students' progress within a specific lesson.
    Assistance is provided as requested by students.

  Level 2: Emerging
    Provides some opportunities for students to monitor their own work and to reflect on progress
    and process.

  Level 3: Applying
    Supports students in developing skills needed to monitor their own learning. Students have
    opportunities to reflect on and discuss progress and process.

  Level 4: Integrating
    Structures learning activities that enable students to set goals and develop strategies for
    demonstrating, monitoring, and reflecting on progress and process.

  Level 5: Innovating
    Facilitates students to initiate learning goals and set criteria for demonstrating and evaluating
    work. Students reflect on progress/process as a regular part of learning experiences.

Source: The ETS program derives its content from Enhancing Professional Practice: A Framework for Teaching
(Danielson 1996). The content of the NTC program is based on two documents: California's Standards
for the Teaching Profession (California Commission on Teacher Credentialing 1997) and Continuum of
Teacher Development (New Teacher Center 2002).

Classroom practice focuses on students' learning needs and teachers' instruction. Various FAS 
tools help mentors and teachers collaboratively develop an understanding of school and community 
resources as well as student profiles. Additional tools focus on analyzing students’ work to permit 
development of a better understanding of learning needs and how to address them, communicating
effectively with parents, and planning lessons. Several tools help the mentor collect data from 
regular classroom observations of the teacher. 

To cover the ETS and NTC program curricula, programs expected mentors to allocate
approximately two hours for contact time each week with every beginning teacher in their
caseload. Mentors were expected to spend some of that time every week meeting with beginning


Average actual time spent with a mentor in one-year and two-year districts is shown in Tables V.3 and VI.4,
respectively. However, these data do not distinguish between time spent with a treatment mentor and time spent with
other mentors.
teachers for one-on-one conversation, particularly around the induction programs’ teacher learning
activities. For the balance of the weekly allotment of time, mentors exercised professional judgment 
in using a range of strategies for assisting beginning teachers with induction program activities or 
general beginning teacher needs, for example, observing instruction, reviewing lesson plans and 
instructional materials, providing a demonstration lesson, reviewing student work, or interacting 
with students to assist teachers in understanding their students’ learning challenges. 
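
The weekly time expectation can be combined with the mentor caseloads reported in Table IV.1 to give a rough sense of the contact time each mentor was expected to schedule. The following is a minimal sketch in Python, assuming the programs' guideline of about two hours per beginning teacher per week; the function name is illustrative, and the results are expectations implied by the guideline, not measured contact time.

```python
# Rough arithmetic only: expected weekly contact hours per mentor, assuming
# the programs' guideline of about two hours per beginning teacher per week.
# Caseload figures are the minimum, average, and maximum from Table IV.1.
HOURS_PER_TEACHER_PER_WEEK = 2.0


def expected_weekly_contact_hours(caseload: float) -> float:
    """Return the contact hours implied by a caseload under the guideline."""
    return caseload * HOURS_PER_TEACHER_PER_WEEK


caseloads = {"minimum": 8, "average": 11.7, "maximum": 14}

for label, teachers in caseloads.items():
    hours = expected_weekly_contact_hours(teachers)
    print(f"{label} caseload ({teachers} teachers): {hours:.1f} hours per week")
```

Under this assumption, the average caseload of 11.7 beginning teachers implies roughly 23 contact hours per week, before any travel between schools is counted.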

Monthly Professional Development Sessions. During the 2005-2006 school year, both
ETS and NTC held monthly, two-hour professional development sessions (Figure IV.4), which
complemented the interactions between mentors and beginning teachers as described in the seven
ETS events and NTC’s FAS (Table IV.4). On average, the professional development sessions drew
72 percent and 65 percent of the ETS and NTC beginning teachers, respectively, as shown in Tables
IV.5 and IV.6. However, average attendance ranged from almost universal attendance in one district
(93 percent) to less than half in another (43 percent).

Study Groups. In the ETS program, the mentors and beginning teachers met monthly in
informal study groups. This gave teachers an opportunity to discuss with their mentors how they
were progressing in their practice, challenges they faced, and approaches for addressing the
challenges. The meetings also enabled teachers to exchange ideas and information related to their
teaching practices. The average attendance at ETS monthly study groups was 69 percent, ranging
across districts from 63 to 84 percent.

Observation of Veteran Teachers. Mentors arranged one or two formal opportunities for
beginning teachers to observe experienced teachers, with an attempt to select observations that
would be relevant to the instructional goals of the beginning teachers. They provided advance
guidance to beginning teachers on what to observe, as well as methods and forms for attending to
the focal instructional practices and recording observations of them. Mentors debriefed the
observations with beginning teachers to discuss what they learned.


In five districts, unexpected scheduling conflicts in the master calendar or other district factors (for example, 
temporary labor disputes) resulted in cancellation of one professional development session with no opportunity to
reschedule. 

* The first NTC session was a full day.

To limit the time burden on teachers, no professional development session was held in the month(s) when the
observations were conducted. Programs encouraged mentors to accompany beginning teachers for the observations, but
it was challenging for mentors to accomplish this while maintaining their regular weekly travel to multiple schools for a
meeting with every beginning teacher in their caseload. Data on the percentage of treatment teachers who observed
veteran teachers together with their mentors and who discussed the observations with mentors during debriefings are
unavailable.
Figure IV.4. Comprehensive Induction Program Activities for Beginning Teachers: 2005-2006 School Year

[Figure not reproduced: timeline of induction activities for ETS and NTC. Recoverable label: Mentors
visit beginning teachers weekly throughout the year (ETS and NTC).]

Notes: BT = beginning teacher; PD = professional development. Activities common to both providers are
shown on both sides of the horizontal divider between ETS and NTC.
Table IV.4. Topics for Monthly Professional Development Sessions, by Program

ETS
  Communication with families
  Classroom management
  Differentiated instruction for ELL and special needs students
  Evidence centered teaching and assessment
  Analyzing and sharing student work
  Examining evidence of professional growth by sharing work from induction program activities
  Beginning teacher self assessment and sharing of learning (colloquium)

NTC
  Effective learning environment (the only full day session)
  Engaging all students
  Assessing all students
  Planning instruction
  Understanding and organizing subject matter
  Developing as a professional educator (colloquium)

Source: The ETS program derives its content from Enhancing Professional Practice: A Framework for Teaching
(Danielson 1996). The content of the NTC program is based on California's Standards
for the Teaching Profession (California Commission on Teacher Credentialing 1997), Continuum of
Teacher Development (New Teacher Center 2002), and other unpublished materials provided to the
study authors by program staff.


Table IV.5. Teacher Attendance at ETS Induction Activities (Percentages): 2005-2006 School Year

                                           Range of Average
                                           Attendance Across
                                           Districts            Regularity of Attendance

                              Average                           Teachers Missing   Teachers Missing
                              Attendance                        No More Than       3 or More
Activity                      of BTs       High      Low        1 Session          Sessions

Orientation (a)               n/a          n/a       n/a        n/a                n/a
Monthly PD sessions
  (five sessions) (b)         72           92        56         20                 29
Study groups                  69           84        63         25                 33
End of year colloquia (a)     87           96        75         n/a                n/a

Source: WestEd attendance logs for activities of treatment teachers in districts receiving the ETS induction
program.

(a) Data not available for orientations. Data available from four of nine districts for end-of-year colloquia.

(b) Average of district averages across all five sessions.

BT = beginning teacher; PD = professional development; n/a = not applicable.

N = 259 teachers.
Table IV.6. Teacher Attendance at NTC Induction Activities (Percentages): 2005-2006 School Year

                                           Range of Average
                                           Attendance Across
                                           Districts            Regularity of Attendance

                              Average                           Teachers Missing   Teachers Missing
                              Attendance                        No More Than       3 or More
Activity                      of BTs       High      Low        1 Session          Sessions

Orientation                   51           94        26         n/a                n/a
Monthly PD sessions
  (six sessions) (a)          65           93        43         23                 22
End of year colloquia         60           96        46         n/a                n/a

Source: WestEd attendance logs for activities of treatment teachers in districts receiving the NTC induction
program.

(a) Average of district averages across all six sessions.

BT = beginning teacher; PD = professional development; n/a = not applicable.

N = 247 teachers.
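
The attendance tables report monthly session attendance as an average of district averages, which weights every district equally regardless of how many beginning teachers it has. The following minimal Python sketch illustrates how that statistic can differ from a pooled rate computed over all teachers. The district counts are hypothetical and chosen only for illustration; they are not study data, and the function names are assumptions of this sketch.

```python
# Hypothetical counts for illustration only (NOT study data):
# each tuple is (beginning teachers enrolled, teachers attending a session).
districts = [(40, 36), (25, 15), (10, 5)]


def average_of_district_averages(data):
    """Mean of per-district attendance rates, each district weighted equally."""
    rates = [attended / enrolled for enrolled, attended in data]
    return 100 * sum(rates) / len(rates)


def pooled_rate(data):
    """Attendance rate over all teachers, so larger districts count for more."""
    total_enrolled = sum(enrolled for enrolled, _ in data)
    total_attended = sum(attended for _, attended in data)
    return 100 * total_attended / total_enrolled


print(f"Average of district averages: {average_of_district_averages(districts):.1f} percent")
print(f"Pooled across teachers:       {pooled_rate(districts):.1f} percent")
```

With these illustrative counts the two statistics differ by several percentage points, because the largest hypothetical district also has the highest attendance. The same distinction is worth keeping in mind when comparing the tables' averages with teacher-level figures.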

End-of-Year Colloquium. The two- to three-hour colloquium in each district focused on 
celebrating the first year’s successes and teachers’ professional growth. It also encouraged teachers 
to set goals for improved instruction for the year ahead. Attendance at the end-of-year colloquia was
similar to that of other events, with about two-thirds participation across the study (87 percent 
across ETS districts and 60 percent across NTC districts), but considerably lower and higher levels 
in some districts (ranging from 46 to 96 percent). 

b. Year 2 Program Services and Activities (2006-2007 School Year) 

A second year of program implementation was provided in 7 of the original 17 districts. As in
year 1, mentoring of beginning teachers (those who were randomly assigned to treatment in year 1
and were now in their second year of teaching) began during the first week of school and continued
weekly throughout the year, with a similar structure. In addition to this, all treatment teachers were
expected to participate in professional development sessions, as noted in Figure IV.5. The ETS
district mentors also held monthly Teacher Learning Community (TLC) meetings with their
beginning teachers. In year 1, these meetings were called “study groups” and mentors primarily 
facilitated general peer support among their beginning teachers. In year 2, the meetings focused 
more on enhancing particular aspects of instruction. Beginning teachers also had release days to 
observe veteran teachers or to work with their mentors on other development tasks, just as they had 
in year 1. Similar to year 1, at the end of this second school year, beginning teachers participated in a 
colloquium. Each of these induction activities is described in more detail below. 
Figure IV.5. Comprehensive Induction Program Activities for Beginning Teachers: 2006-2007 School Year

[Figure not reproduced: timeline of induction activities, with ETS and NTC shown on either side of a
horizontal divider. Axis labels recoverable from the original: Start of School Year, Month 3,
Month 5, Month 7, End of School Year.]

Note: Activities common to both providers are shown on both sides of the horizontal divider between ETS and
NTC.

PD = professional development.


Mentoring. Mentoring in the second year was similar to the support provided in the first year. 
Programs again expected mentors to allocate approximately two hours of contact each week with 
every beginning teacher in their caseload and to engage in the same kinds of mentor/novice
interactions described for year 1. The framework for ETS mentors was again Pathwise Induction
Events, whereas NTC mentors again used the FAS.

Professional Development. The ETS and NTC programs included between 35 and 42 hours
of professional development for beginning teachers in year 2. In ETS districts, a total of eight
2-hour sessions were held, as well as two all-day sessions (in months one and four of the school
year) and a release day for observation of other teachers. NTC districts held one all-day session in
month two or three, five 2-hour sessions throughout the year, and three release days for observation


There was variation within and between districts in the amount of time devoted to any particular session, but the
total time allocated in any district fell within this range.
of other teachers or individual work with mentors. As in year 1, session topics continued to be
related to the mentors’ weekly work with their beginning teachers.


Programs changed the content and conduct of the professional sessions during this second year
to reflect the growth of mentors and beginning teachers and the evolution of their circumstances
and needs. Although staff of both programs traveled to districts to conduct the all-day sessions,
mentors took the lead in carrying out the rest of the professional development sessions. Following
the initial program-led sessions, mentors in each NTC district fleshed out details of nationally
assigned topics (for example, differentiation in instruction) and designed activities to reflect local
needs, in consultation with the program leader and their district coordinator. As in year 1, the NTC
sessions used active-learning activities. The ETS Teacher Learning Communities were led by
mentors and were an adaptation of the first year’s study groups during which beginning teachers met
monthly to discuss their individual needs and practices. In year 2, the ETS program provided
specific content for each session and a formal structure for taking teachers through a cycle that
consisted of (1) illustrating possible approaches for the instruction, (2) having teachers try them out,
and (3) debriefing the resulting experience in the next session.


On average, the professional development sessions drew 62 percent and 58 percent of the 
beginning teachers over the course of the year, ETS and NTC respectively (Table IV.7). The 
attendance at the all-day sessions in both programs generally was higher than at the two-hour 
sessions that were most often held after school: 75 percent and 79 percent for the first ETS and 
NTC all-day sessions, and 55 percent for the second ETS all-day session. Thirty-eight percent and 
27 percent of teachers (ETS and NTC, respectively) participated in 80 percent or more of the 
sessions. Approximately one-third of teachers missed the majority (more than 50 percent) of the 
sessions (36 percent and 35 percent of ETS and NTC teachers, respectively). 


Table IV.8 lists the topics for the professional development sessions, by program. The topics 
for the first two NTC sessions (communication with families, and equitable instruction and student 
achievement) were extensions of topics introduced in year 1. NTC selected these topics from an 
analysis of needs expressed by treatment teachers in an NTC-administered survey in the latter part 
of the first year. The ETS TLC sessions employed an existing ETS professional development 
product, Keeping Learning on Track: Integrating Assessment with Instruction through Teacher Learning 
Communities. The content of the product, described in Table IV.8, was introduced in the two all-day 
professional development sessions; during their monthly TLC meetings, teachers then discussed the 
topics and whatever experiences they had in applying the practices in their classrooms. ETS staff 
continually made minor but important adaptations of the product for specific use with beginning 
teachers in the study; for example, developing more elementary school examples than the standard 
product contained. 


In one ETS district, a single professional development session had to be cancelled because of unexpected local 
scheduling conflicts. 

WestEd attendance logs are the source data for discussion of participation of beginning teachers in professional 
development sessions. 

Average attendance ranged widely among the districts, from 36 to 71 percent and 48 to 74 percent (ETS and 
NTC, respectively). 
Table IV.7. ETS and NTC Teacher Attendance: Professional Development Sessions and Colloquia: 2006-2007 
School Year 

                                        Range of Average
                                        Attendance Across
                                        Districts              Regularity of Attendance
                                        (Percentage)           (Percentage)
                          BT Average                           BTs Attending      BTs Missing
Activity                  Attendance    High       Low         Most Sessions      Most Sessions

Monthly PD Sessions
  ETS (9 sessions)        62            71         36          38                 36
                                                               (miss 1-2 of 9)    (miss 5+ of 9)
  NTC (5 sessions)        58            74         48          27                 35
                                                               (miss 1 of 5)      (miss 3+ of 5)

End-of-Year Colloquium
  ETS                     61            70         29          n/a                n/a
  NTC                     60            61         58          n/a                n/a

Source: WestEd attendance logs for activities of treatment teachers in districts receiving the induction program. 

BT = beginning teacher; PD = professional development; n/a = not applicable. 

N = 192 teachers (ETS) and 206 teachers (NTC). 

Observation of Veteran Teachers. Mentors arranged formal opportunities for beginning 
teachers to observe experienced teachers, with an attempt to select observations that would be 
relevant to the instructional goals of the beginning teachers. Both programs required one 
observation, but NTC participants also could use another of their three release days for additional 
observations. ETS and NTC mentors provided similar types of guidance and observation 
debriefings as in the first year. 

End-of-Year Colloquium. As in the first year, the two- to three-hour end-of-year colloquia in 
each district focused on celebrating the year’s successes and teachers’ professional growth. They also 
encouraged teachers to set goals for improved instruction for the next school year. Attendance at 
the end-of-year colloquia was similar to that of other professional development events (61 percent 
and 60 percent of teachers, ETS and NTC, respectively), with notably higher and lower levels among 
individual districts (ranging from 29 to 70 percent). 
Table IV.8. Topics for Professional Development Sessions, by Program 

ETS 

Expanded examination of framework for teaching 
    This session is a review of the conceptual framework that shaped the ETS induction program in 
    year 1 (see Table IV.2). 

Using evidence to inform practice: norms for teacher learning communities 
    This session established a focus on teaching (versus providing general peer support). It also set 
    norms for professional and interpersonal behavior during sessions, and a structure and timetable 
    to use in each session. 

Using learning intentions to strengthen starts and ends of lessons 
    This session focused on establishing clear expectations/goals for lessons and an assessment of 
    goal attainment. 

Providing formative feedback 
    This session focused on the range and frequency of written feedback provided on student 
    assignments. 

Developing quality hinge questions 
    This session focused on using optimal questioning strategies to engineer effective classroom 
    discussions, questions, and learning tasks. 

Student self- and peer-assessment 
    This session focused on the value of, and how to establish, clear scoring/grading rubrics. 

NTC 

Expanded examination of standards for teaching 
    This session is a review of the six professional teaching standards. 

Strong parental relationships and communication 
    This session focused on family-teacher conferences, general and specific strategies for 
    communication with families, and ways to enlist and build partnerships with families. 

Equitable instruction and student achievement (the only full-day session) 
    This session focused on recognizing individual student needs, and analyzing student work to 
    identify individual needs. 

Differentiated instruction 
    This session focused on differentiating instruction to meet individual needs, by tailoring 
    instructional materials and varying modes of instruction. 

Other topics (a) 
    These sessions typically delved further into topics begun in prior sessions. 

Source: ETS: Keeping Learning on Track; NTC: varied proprietary documents from the induction program. 

(a) Identified in consultation with NTC staff and inspection of their data from year 1 participant survey. 


B. Treatment-Control Differences in Teacher Induction Services 

We do not compare comprehensive teacher induction to the absence of any support services 
for new teachers; rather, we compare comprehensive teacher induction to the prevailing level of 
induction services in the selected districts. We used the control group to characterize the types and 
intensity of district and school support that beginning teachers in the study schools would normally 
receive in the absence of the experimental intervention. The intervention gave treatment teachers 
the opportunity to receive services through the comprehensive induction programs, but 
participation was voluntary. By comparing service receipt in the treatment group with that in the 
control group, we derived estimates of the service contrast, which provides the necessary context for 
understanding the results of analyses on teacher and student outcomes presented in Chapters V and 
VI. Estimates of the service contrast were computed using an ordinary least squares regression 
model with district and grade fixed effects that accounted for clustering of teachers within schools; 
weights were applied to adjust for survey nonresponse and the study design. One-year and two-year 
districts were analyzed separately. 
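
As a rough illustration of the kind of estimator described above, the following sketch fits a weighted least squares regression of a service measure on a treatment indicator plus district and grade dummies, and computes a school-clustered (sandwich) standard error for the treatment coefficient. It is a minimal sketch under stated assumptions, not the study's actual code: the field names (`y`, `treat`, `district`, `grade`, `school`, `weight`), the helper functions, and the demonstration data are all hypothetical, and no small-sample correction is applied to the clustered variance.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def service_contrast(rows):
    """Return (treatment coefficient, school-clustered standard error).

    rows: list of dicts with hypothetical keys
          y, treat, district, grade, school, weight.
    """
    # Fixed effects: one dummy per district and grade, omitting a reference level.
    districts = sorted({r["district"] for r in rows})[1:]
    grades = sorted({r["grade"] for r in rows})[1:]

    def design(r):
        return ([1.0, float(r["treat"])]
                + [float(r["district"] == d) for d in districts]
                + [float(r["grade"] == g) for g in grades])

    X = [design(r) for r in rows]
    k = len(X[0])
    # Weighted normal equations: (X'WX) beta = X'Wy.
    XtWX = [[sum(r["weight"] * x[i] * x[j] for r, x in zip(rows, X))
             for j in range(k)] for i in range(k)]
    XtWy = [sum(r["weight"] * x[i] * r["y"] for r, x in zip(rows, X))
            for i in range(k)]
    beta = solve(XtWX, XtWy)

    # Row of (X'WX)^-1 belonging to the treatment coefficient (index 1).
    bread_row = solve(XtWX, [1.0 if i == 1 else 0.0 for i in range(k)])

    # Sum weighted score vectors within each school (the cluster).
    scores = {}
    for r, x in zip(rows, X):
        resid = r["y"] - sum(b * xi for b, xi in zip(beta, x))
        s = scores.setdefault(r["school"], [0.0] * k)
        for i in range(k):
            s[i] += r["weight"] * x[i] * resid

    # Sandwich variance for the treatment coefficient only.
    var = sum(sum(b * si for b, si in zip(bread_row, s)) ** 2
              for s in scores.values())
    return beta[1], var ** 0.5


# Made-up demonstration data: 24 teachers, 3 districts, 2 grades, 6 schools.
demo = []
for i in range(24):
    treat = (i // 6) % 2
    demo.append({
        "y": 67 + 20 * treat + 5 * (i % 3) + 3 * ((i // 3) % 2) + (i % 5 - 2),
        "treat": treat, "district": i % 3, "grade": (i // 3) % 2,
        "school": i // 4, "weight": 1.0 + 0.5 * (i % 2),
    })
estimate, std_err = service_contrast(demo)
print(f"service contrast: {estimate:.1f} (clustered SE {std_err:.1f})")
```

Running the function separately on one-year and two-year district samples would mirror the separate analyses described above; in practice a statistics package with built-in cluster-robust options would normally be used instead of hand-rolled linear algebra.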

The induction activities surveys that were administered during fall 2005, spring 2006, fall 2006, 
spring 2007 (for teachers in two-year districts), fall 2007, and fall 2008 were used to characterize the 
induction services received by the treatment and control teachers during their first four years in the 
profession. We examined differences in services received by treatment and control teachers both 
before and after the comprehensive induction services ended. Treatment teachers in one-year and 
two-year districts were offered the same usual district services as control teachers following the 
conclusion of the intervention (beginning in year 2 for treatment teachers in one-year districts and 
year 3 for treatment teachers in two-year districts). The examination of service usage in these later 
years is important to providing a full picture of the service contrast. Analysis of the services received 
by control teachers in the later years provides a description of typical district and school induction 
support following teachers' initial years in the classroom. Moreover, the analysis can show whether 
the intervention induced future changes in treatment teachers' usage of these services beyond what it 
would have been in the absence of the intervention. 


This section presents differences in service receipt between treatment and control teachers in 
the following areas: mentor assignments, number and types of mentors, meeting time with mentors, 
mentor activities, areas of mentor guidance, observation and feedback, professional development 
activities, and professional development session topic areas. We first discuss results for one-year 
districts, followed by results for two-year districts. Findings are shown in figures; Appendix B 
presents tables of treatment and control group means and service contrast estimates for each data 
collection period. 


Overall, results showed statistically significant differences in the amount, type, and content of 
induction support received by treatment and control teachers. The findings are given below and 
described in more detail in the remainder of this chapter. 


• In one-year districts, both treatment and control teachers reported receiving 
substantial induction support. However, treatment teachers, whose participation 
in the comprehensive induction program was voluntary, received more and 
different support than control teachers during the comprehensive induction 
program (their first year of teaching). Relative to control teachers, treatment teachers 
were more likely to have an assigned mentor (90 versus 70 percent in fall 2005, p-value 
0.000; 90 versus 72 percent in spring 2006, p-value 0.000) and spent more time per week 
meeting with their mentors (87 versus 67 minutes in fall 2005, p-value 0.007; 85 versus 
68 minutes in spring 2006, p-value 0.039); these differences were all statistically 
significant. There were additional statistically significant differences favoring treatment 
teachers in the time spent being observed by mentors during the last full week of 
teaching (34 versus 10 minutes in fall 2005, p-value 0.000; 27 versus 7 minutes in spring 
2006, p-value 0.000), the frequency of informal feedback during the three months prior 
to the survey (3.2 versus 2.4 times in fall 2005, p-value 0.000; 2.5 versus 1.9 times in 
spring 2006, p-value 0.001), whether a teacher worked with a study group of new 
teachers during the three months prior to the survey (66 versus 34 percent in fall 2005, 
p-value 0.000; 71 versus 29 percent in spring 2006, p-value 0.000), and whether a teacher 
observed others teaching in their classrooms during the three months prior to the survey 
(61 versus 44 percent in fall 2005, p-value 0.000; 68 versus 39 percent in spring 2006, 
p-value 0.000). Treatment teachers were also more likely to receive mentors’ guidance in 
9 of 10 areas covered by the survey during the last full week of teaching. (Supporting 
details may be found in Appendix Tables B.3 and B.8.) No adjustments for multiple 
comparisons were made in assessing the statistical significance of the results because 
such adjustments were unnecessary or inappropriate. The analysis examined 
132 comparisons during the comprehensive induction program for one-year districts, of 
which 75 were statistically significant and 6.6 would be significant by chance if the 
program had no effect. 

Throughout this chapter we present results from a large number of hypothesis tests. When conducting many 
independent hypothesis tests, a small percentage of results will be statistically significant even if no underlying 
relationship exists (5 percent, when using an alpha level of 0.05). To guard against this problem of multiple comparisons, 
we present all findings in context, discussing the nonsignificance of the tests where we fail to reject the null hypothesis 
as well as the statistical significance of those that do. See the end of Chapter II for further discussion of this issue. 

Teachers who report not having a mentor are given a value of zero for all variables related to time spent with a 
mentor and supports received from a mentor. 

• In two-year districts, treatment and control teachers reported receiving 
substantial induction support as well. However, similar to the findings in one- 
year districts, treatment teachers received more and different support than control 
teachers during the comprehensive induction program (their first two years of 
teaching). Relative to control teachers, treatment teachers were more likely to have an 
assigned mentor (between fall 2005 and spring 2007, the percent of treatment teachers 
ranged from 80 to 96 and the percent of control teachers ranged from 34 to 79; p-values 
all 0.000) and spent more time per week meeting with their mentors (between fall 2005 
and spring 2007, time spent by treatment teachers ranged from 79 to 124 minutes and 
the time spent by control teachers ranged from 41 to 82 minutes; p-values ranged from 
0.087 to 0.001); these differences were statistically significant with the exception of 
meeting time in spring 2006. There were additional statistically significant differences 
favoring treatment teachers: the time spent being observed by mentors during the last 
full week of teaching (between fall 2005 and spring 2007, the time spent by treatment 
teachers ranged from 19 to 38 minutes and the time spent by control teachers ranged 
from 7 to 17 minutes; p-values ranged from 0.000 to 0.003), the time spent watching 
mentors model lessons during the last full week of teaching (between fall 2005 and 
spring 2007, the time spent by treatment teachers ranged from 10 to 16 minutes and the 
time spent by control teachers ranged from 4 to 10 minutes; p-values ranged from 0.003 
to 0.027), the frequency of informal feedback during the three months prior to the 
survey (between fall 2005 and spring 2007, the frequency for treatment teachers ranged 
from 1.9 to 2.8 times and the frequency for control teachers ranged from 1.5 to 
2.5 times; p-values ranged from 0.001 to 0.266), and whether a teacher worked with a 
study group of new teachers during the three months prior to the survey (between fall 
2005 and spring 2007, the percent of treatment teachers ranged from 42 to 67 and the 
percent of control teachers ranged from 14 to 25; p-values all 0.000). Treatment teachers 
were also more likely to receive mentors’ guidance in all 10 areas covered by the survey 
during the last full week of teaching. (Supporting details may be found in Appendix 
Tables B.18 and B.23.) No adjustments for multiple comparisons were made in assessing 
the statistical significance of the results because such adjustments were unnecessary or 
inappropriate. The analysis examined 264 comparisons during the comprehensive 
induction program for two-year districts, of which 70 were statistically significant and 
13.2 would be significant by chance if the program had no effect. 

• In their second year, immediately following the end of the comprehensive 
induction program, treatment teachers in one-year districts received less and 
different induction support than control teachers. For measures such as the 
percentage of teachers with an assigned mentor and time spent meeting with mentors 
per week, this reflects a significant drop in support among control teachers and an even 
larger significant drop in support among treatment teachers. A survey of teachers in 
one-year districts conducted in fall 2006 showed that there were statistically significant 
differences favoring the control teachers in several areas: treatment teachers were less 
likely than control teachers to have an assigned mentor (20 percent of treatment teachers 
versus 29 percent of control teachers, p-value 0.017), spent less time per week meeting 
with their mentors (19 minutes for treatment teachers versus 39 minutes for control 
teachers, p-value 0.002), spent less time being observed by mentors during the last full 
week of teaching (2 minutes by treatment teachers versus 6 minutes by control teachers, 
p-value 0.021), were less likely to work with a study group of new teachers during the 
three months prior to the survey (11 percent of treatment teachers versus 21 percent of 
control teachers, p-value 0.003), and were less likely to receive mentors’ guidance in all 
10 areas covered by the survey during the last full week of teaching. (Supporting details 
may be found in Appendix Table B.3.) No statistically significant differences favoring the 
treatment teachers were found. No adjustments for multiple comparisons were made in 
assessing the statistical significance of the results because such adjustments were 
unnecessary or inappropriate. The analysis examined 66 comparisons during fall 2006 for 
one-year districts, of which 29 were statistically significant and 3.3 would be significant 
by chance if the program had no effect. 

• In the third and fourth years of teaching, after the intervention ended for all 
districts, treatment and control teachers received similar levels of support. In both 
one- and two-year districts, there were statistically significant differences in fewer than 
seven percent of the 134 measures we surveyed. No adjustments for multiple 
comparisons were made in assessing the statistical significance of the results because 
such adjustments were unnecessary or inappropriate. The analysis examined 
132 comparisons during study teachers’ third and fourth years of teaching for one-year 
districts, of which 8 were statistically significant and 6.6 would be significant by chance if 
the program had no effect. We also examined 132 comparisons during study teachers' 
third and fourth years of teaching for two-year districts, of which 9 were statistically 
significant and 6.6 would be significant by chance if the program had no effect. 
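
The "significant by chance" figures quoted in the findings above follow from multiplying the number of comparisons by the 0.05 alpha level. The short sketch below reproduces that arithmetic for the reported counts; the function name and the labels are ours, chosen only for illustration.

```python
ALPHA = 0.05  # alpha level stated in the chapter's note on hypothesis tests


def expected_by_chance(n_comparisons, alpha=ALPHA):
    """Expected number of significant results if every null hypothesis were true."""
    return n_comparisons * alpha


# (label, comparisons examined, comparisons reported as significant)
REPORTED = [
    ("during program, first finding", 132, 75),
    ("during program, second finding", 264, 70),
    ("fall 2006, third finding", 66, 29),
    ("years 3-4, first set", 132, 8),
    ("years 3-4, second set", 132, 9),
]

for label, n, observed in REPORTED:
    print(f"{label}: {observed} of {n} significant; "
          f"{expected_by_chance(n):.1f} expected by chance")
```

Setting the observed counts beside the chance expectation shows why the during-program contrasts are read as real differences, while the 8 and 9 significant results in years 3 and 4 are close to the 6.6 expected with no effect.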
1. One-Year Districts 


a. Mentor Assignments 

A majority of both treatment and control teachers reported having a mentor assigned to them 
during the comprehensive induction program, but treatment teachers were significantly more likely 
than control teachers to report this support. Figure IV.6 illustrates the levels of support for 
treatment and control teachers over time. The differences were 90 versus 70 percent in fall 2005 and 
90 versus 72 percent in spring 2006. While 10 percent of treatment teachers did not report having a 
mentor assigned to them, all treatment teachers were assigned a mentor as part of the 
comprehensive induction program. 

Figure IV.6. Treatment-Control Differences in Percent Who Have an Assigned Mentor: One-Year Districts 

[Figure: percent of treatment and control teachers with an assigned mentor, by survey period: fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008.] 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. N = 503 teachers in fall 2005, 
499 teachers in spring 2006, 472 teachers in fall 2006, 426 teachers in fall 2007, and 398 teachers in 
fall 2008. 

Treatment-control differences are significantly different from zero at the 0.05 level except in fall 2007 and fall 2008. 

The percent of both treatment and control teachers with an assigned mentor dropped 
significantly from spring 2006 to fall 2006 (p-values both 0.000), after the comprehensive induction 
program ended. The drop was larger for treatment teachers; treatment teachers were significantly less 
likely than control teachers to report having an assigned mentor (20 versus 29 percent) during the 
fall after the comprehensive induction program ended. By fall 2007 and also in fall 2008, differences 
between treatment and control teachers were insignificant (11 versus 14 percent and 8 versus 
8 percent). Differences between treatment and control teachers in the likelihood of having any 
mentor followed a similar pattern over time. 
b. Number and Types of Mentors 

Treatment teachers’ number of mentors and mentor profiles differed significantly from those of 
control teachers during the comprehensive induction program. Treatment teachers were significantly 
more likely than control teachers to report having multiple mentors (25 versus 15 percent in fall 
2005 and 22 versus 12 percent in spring 2006), having two mentors assigned to them (19 versus 
7 percent in fall 2005 and 18 versus 7 percent in spring 2006), and having a full-time mentor 
(74 versus 8 percent in fall 2005 and 72 versus 10 percent in spring 2006). Also in this first year, 
treatment teachers were significantly less likely than control teachers to report having a mentor who 
was a teacher (25 versus 64 percent in fall 2005 and 25 versus 66 percent in spring 2006). 

In the fall after the comprehensive induction program ended, treatment teachers were 
significantly less likely than control teachers to report having two assigned mentors (2 versus 
6 percent) or having a mentor who was a teacher (21 versus 31 percent). Treatment and control 
teachers did not differ significantly in having multiple mentors or a full-time mentor. The number 
and type of treatment and control teachers’ mentors were not significantly different in fall 2007 and 
fall 2008. 

c. Meetings with Mentors 

Treatment and control teachers both spent more than an hour a week in mentor meetings and 
activities during the comprehensive induction program, but treatment teachers spent significantly 
more time than control teachers. Figure IV.7 illustrates average total weekly mentor meeting time 
for treatment and control teachers over time. Combining usual scheduled time and informal time 
during the most recent full week of teaching, we found that treatment teachers spent an average of 
87 minutes in mentor meetings compared to 67 minutes for control teachers in fall 2005, and 
85 versus 68 minutes in spring 2006. Since total meeting time was not reported directly but had to 
be constructed from reports of the frequency and duration of usual scheduled meetings and the time 
spent in informal meetings, we could not determine precisely whether treatment teachers met with 
their study mentors for two hours per week as the ETS and NTC programs expected. The 
statistically significant differences in meeting time (20 minutes and 17 minutes) were attributable 
entirely to differences in the duration of the usual scheduled meetings (23 versus 10 minutes in fall 
2005 and 23 versus 11 minutes in spring 2006). Differences in total meeting time in fall 2005 and fall 
2006 are shown separately by district in Figures B.1 and B.2 in Appendix B. Details of the method that 
generates these results can be found in Appendix A. 
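
To make the idea of a constructed measure concrete, the sketch below shows one plausible way to combine survey responses into total weekly mentor meeting minutes. The actual construction is documented in Appendix A and may differ; the formula (scheduled meetings per week times minutes per meeting, plus informal minutes), the parameter names, and the example values are assumptions made for illustration. The zero assigned to teachers without a mentor follows the chapter's note on how such teachers are coded.

```python
def total_weekly_minutes(has_mentor, meetings_per_week,
                         minutes_per_meeting, informal_minutes):
    """Hypothetical construction of total weekly mentor meeting time.

    Teachers who report no mentor are coded as zero minutes, as the
    chapter states for all mentor-time variables. The remaining formula
    is an assumed illustration, not the documented Appendix A method.
    """
    if not has_mentor:
        return 0.0
    scheduled = meetings_per_week * minutes_per_meeting
    return float(scheduled + informal_minutes)


def mean(values):
    """Simple unweighted average (the study itself applied survey weights)."""
    return sum(values) / len(values)


# Made-up responses: (has_mentor, meetings/week, minutes/meeting, informal minutes)
treatment = [(True, 1, 60, 30), (True, 2, 30, 20), (False, 0, 0, 0)]
control = [(True, 1, 45, 20), (False, 0, 0, 0), (True, 1, 30, 15)]

t_mean = mean([total_weekly_minutes(*r) for r in treatment])
c_mean = mean([total_weekly_minutes(*r) for r in control])
print(f"treatment {t_mean:.0f} min, control {c_mean:.0f} min, "
      f"difference {t_mean - c_mean:.0f} min")
```

Because the total is assembled from several self-reported pieces rather than measured directly, it is best read as an approximation, which is why the text cannot say precisely whether the expected two hours per week were met.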

Differences in the time spent with full-time mentors and mentors who were teachers reflected 
differences in the types of mentors that treatment and control teachers reported. Treatment teachers 
reported spending significantly more time during the most recent week of teaching meeting with 
full-time mentors than did control teachers (60 versus 4 minutes in fall 2005 and 52 versus 6 minutes 
in spring 2006). However, they reported spending significantly less time than control teachers with 
mentors who were teachers (23 versus 59 minutes in fall 2005 and 26 versus 59 minutes in spring 
2006). Figure IV.8 illustrates average minutes each week meeting with mentors who were teachers 
for treatment and control teachers over time. 
Figure IV.7. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week: One-Year Districts 

[Figure: total minutes spent in mentoring per week for treatment and control teachers, by survey period: fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008.] 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. N = 503 teachers in fall 2005, 
499 teachers in spring 2006, 472 teachers in fall 2006, 426 teachers in fall 2007, and 398 teachers in 
fall 2008. 

Treatment-control differences are significantly different from zero at the 0.05 level except in fall 2007 and fall 2008. 

In the fall after the comprehensive induction program ended, the time spent by both treatment 
and control teachers in mentor meetings dropped significantly (p-values both 0.000) to less than 
40 minutes per week. However, the drop was larger for treatment teachers, who consequently spent 
significantly less total time than control teachers in mentor meetings (19 versus 39 minutes). Negative 
differences were found in both components of total mentor meeting time: usual scheduled meetings 
(10 versus 18 minutes) and informal meetings (9 versus 20 minutes). The significant difference in 
total meeting time was due to treatment teachers spending less time than control teachers with 
mentors who were teachers (17 versus 33 minutes) or mentors who were administrators (0 versus 
2 minutes), since the differences in meeting time with full-time mentors (1 versus 3 minutes) and 
staff external to the district (1 versus 0 minutes) were insignificant. Two to three years after the 
comprehensive induction program was implemented (fall 2007 and fall 2008), differences between 
treatment and control teachers were insignificant in terms of total mentor meeting time (13 versus 
23 minutes in fall 2007 and 13 versus 11 minutes in fall 2008) as well as time spent with full-time 
mentors and mentors who were teachers. 
Figure IV.8. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week with Mentors Who 
Are Teachers: One-Year Districts 

[Figure: minutes per week spent with mentors who are teachers, for treatment and control teachers, by survey period: fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008.] 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. N = 503 teachers in fall 2005, 
499 teachers in spring 2006, 472 teachers in fall 2006, 426 teachers in fall 2007, and 398 teachers in 
fall 2008. 

Treatment-control differences are significantly different from zero at the 0.05 level except in fall 2007 and fall 2008. 

d. Mentor Activities 

During the comprehensive induction program, treatment teachers reported spending 9 to 
34 minutes in each of six types of mentoring activities during the most recent full week of teaching 
while control teachers reported spending 6 to 23 minutes. Figure IV.9 illustrates differences in levels 
of support for treatment and control teachers over time. A positive difference indicates that 
treatment teachers spent more time than control teachers in an activity, while a negative difference 
indicates that they spent less time; solid circles indicate that the difference is statistically significant. 

Treatment teachers spent significantly more time than control teachers in certain of these 
activities during the comprehensive induction program: being observed by mentors (34 versus 
10 minutes in fall 2005 and 27 versus 7 minutes in spring 2006), meeting one-on-one with mentors 
(34 versus 23 minutes in fall 2005 and 31 versus 21 minutes in spring 2006), meeting with mentors 
together with other first-year teachers (29 versus 9 minutes in fall 2005 and 24 versus 6 minutes in 
spring 2006), and having mentors model lessons (9 versus 6 minutes in fall 2005; the difference is 
insignificant in spring 2006). Treatment teachers did not spend significantly more time than control 
teachers meeting together with mentors and other teachers or co-teaching lessons with mentors. The 
total time spent in the six types of activities covered in the survey averaged 130 minutes per week for 
treatment teachers and 67 minutes per week for control teachers in fall 2005, a significant difference 
of 63 minutes per week. In spring 2006, the total time was 108 minutes for treatment teachers and 
59 minutes for control teachers, a significant difference of 49 minutes. 




Figure IV.9. Treatment-Control Differences in Time Spent in Six Mentoring Activities in the Last Full Week of 
Teaching: One-Year Districts 

[Figure: treatment-control differences in minutes spent in each of six mentoring activities (A through F), by survey period: fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008.] 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. Solid (open) circles indicate that 
treatment-control differences are (not) statistically significant at the 0.05 level. In fall 2005, E is 
significant but D is not. In fall 2006, A is significant but D is not. In fall 2007, E is significant but D is not 
(N = 503 teachers in fall 2005, 499 teachers in spring 2006, 472 teachers in fall 2006, 426 teachers in 
fall 2007, and 398 teachers in fall 2008). 

Legend: A = Observing BT teaching, B = Meeting with BT one-on-one, C = Meeting with BT and other 
first-year teachers, D = Meeting with BT and other teachers, E = Modeling a lesson, F = Co-teaching a 
lesson. BT = Beginning Teacher. 

In contrast, after the comprehensive induction program ended, treatment teachers reported 
spending 0 to 7 minutes per week in each of six types of mentoring activities during the most recent 
full week of teaching across fall 2006, fall 2007, and fall 2008, while control teachers reported 
spending 0 to 10 minutes. Treatment teachers spent significantly /ess total time than control teachers 
in all six activities covered in the fall 2006 and fall 2007 surveys (22 versus 36 minutes per week in 
fall 2006 and 12 versus 24 minutes per week in fall 2007). Differences in the time spent in each 
individual activity were insignificant, with two exceptions across the 12 activities and periods: 
treatment teachers spent less time being observed by mentors in fall 2006 (2 versus 6 minutes per 
week) and less time watching mentors model a lesson in fall 2007 (0.2 versus 3 minutes per week). 
There were no significant differences in the total time spent in all six activities or in the time spent in 
each activity in fall 2008. 

e. Areas of Mentor Guidance 

During the comprehensive induction program, a majority of treatment teachers reported 
receiving mentors’ assistance during the last full week of teaching in all 10 topic areas covered in the 
survey and in both fall 2005 and spring 2006, while about 40 to 70 percent of control teachers 
reported receiving assistance. Figure IV.10 illustrates differences in levels of support for treatment 
and control teachers over time. The percentage of treatment teachers receiving each type of 
assistance ranged from 56 percent (sharing lesson plans, assignments, and other instructional 
activities) to 87 percent in fall 2005, and from 53 percent (receiving help with teaching to meet state or 
district standards) to 78 percent in spring 2006. Among control teachers, the percentage reporting 
each type of assistance ranged from 44 percent (receiving guidance on how to assess students) to 
66 percent in fall 2005, and from 41 percent (discussing instructional goals and ways to achieve them) to 
68 percent in spring 2006. The area in which both treatment and control teachers received the most 
guidance in each period was encouragement or moral support. 

Examining differences in service receipt during the comprehensive induction program, we 
found that treatment teachers were significantly more likely than control teachers to report receiving 
mentors’ assistance during the last full week of teaching in all 10 topic areas covered in the survey, 
with one exception across the 20 areas and periods: treatment teachers were less likely than control 
teachers to have mentors share lesson plans, assignments, or other instructional activities in fall 
2005. Significant differences ranged from 14 percentage points (receiving help with administrative 
and logistical issues) to 27 percentage points (receiving help identifying teaching challenges and 
solutions) in fall 2005 and from 9 percentage points (sharing lesson plans, assignments, or other 
instructional activities) to 20 percentage points (discussing instructional goals and ways to achieve 
them) in spring 2006. Discussing instructional goals and ways to achieve them was among the two 
areas of guidance with the largest impacts during both the fall and spring of the comprehensive 
induction program; the following areas of guidance were among the five with the largest impacts in 
both the fall and spring: receiving suggestions to improve practice, having opportunities to raise 
issues and discuss concerns, and having the mentor act on a teacher’s request. 

During the fall after the comprehensive induction program ended, fewer than half of treatment 
or control teachers reported receiving each type of assistance during the last full week of teaching. 
Among treatment teachers, the percentage reporting each type of assistance ranged from 11 percent 
(receiving guidance on how to assess students) to 21 percent. Among control teachers, the 
percentage reporting each type of assistance ranged from 19 percent (receiving help teaching to state 
standards) to 33 percent. The area in which both treatment and control teachers received the most 
guidance in each period was encouragement or moral support. Estimating differences in service 
receipt in fall 2006, we found that treatment teachers were significantly less likely than control 
teachers to report receiving mentors' assistance in each of the 10 topic areas the survey covered. 
Differences ranged from 8 percentage points (receiving help teaching to meet state or district 
standards) to 14 percentage points (having opportunities to raise issues or discuss concerns). 

In fall 2007 and fall 2008, no more than 20 percent of treatment or control teachers reported 
receiving each type of guidance. The percentage of treatment teachers receiving each type of 
guidance ranged from 7 percent (help teaching to meet state or district standards, having mentor act 
on the teacher’s request) to 18 percent in fall 2007 and from 4 percent (help teaching to meet state 
or district standards) to 11 percent in fall 2008. Among control teachers, the percentage receiving 
each type of guidance ranged from 14 percent (receiving suggestions to improve practice) to 
20 percent (encouragement and moral support, having an opportunity to raise issues or discuss 
concerns) in fall 2007 and from 5 percent (receiving guidance on how to assess students) to 
11 percent in fall 2008. The area in which both treatment and control teachers received the most 
guidance in each period was encouragement or moral support. Turning to the service contrast, 
differences between treatment and control teachers in the receipt of guidance were insignificant in 
fall 2007 and fall 2008, with two exceptions across the 20 areas and periods: treatment teachers were 
less likely to receive guidance in teaching to meet state or district standards (7 versus 16 percent) or 
to have their mentors act on something they requested (7 versus 15 percent) in fall 2007. 

f. Observations and Feedback 

During the comprehensive induction program, both treatment and control teachers reported 
receiving two types of observations and three types of feedback at least once during the three 
months prior to the survey. Figure IV.11 illustrates differences in levels of support for treatment and 
control teachers over time. 

Treatment teachers received certain types of observations and feedback significantly more 
frequently than control teachers during the comprehensive induction program.* Treatment teachers 
were significantly more frequently observed by mentors (4.0 versus 1.5 times in fall 2005 and 
3.5 versus 1.5 times in spring 2006) and more frequently given informal feedback on teaching 
(3.2 versus 2.4 times in fall 2005 and 2.5 versus 1.9 times in spring 2006). Treatment and control 
teachers did not differ significantly in the frequency with which they were observed by their 
principals (2.3 versus 2.6 times in fall 2005 and 1.9 versus 2.1 times in spring 2006) or the frequency 
with which they received feedback on lesson plans (1.6 versus 1.7 times in fall 2005 and 1.3 versus 
1.5 times in spring 2006) or as part of a formal evaluation (1.7 versus 1.4 times in fall 2005 and 
1.6 versus 1.4 times in spring 2006) during the intervention. 

After the comprehensive induction program ended, the frequency of the two types of 
observations during the three months prior to the survey ranged from 0 to 2 times and the 
frequency of the three types of feedback ranged from 1 to 2 times for both treatment and control 
teachers. Treatment and control teachers did not differ significantly in the frequency with which 
they engaged in these activities with one exception across the 15 activities and periods: treatment 
teachers were significantly less likely to be observed by a mentor in fall 2006 (0.3 versus 0.6 times). 


* Reported levels of service receipt in the fall may be low relative to the spring as the three months prior to the 
survey may include part of the summer vacation for some teachers. The percent of teachers completing fall Induction 
Activities surveys prior to December, and thus reflecting on a period that begins before September 1st, is 32 percent in 
fall 2005, 83 percent in fall 2006, 75 percent in fall 2007, and 79 percent in fall 2008. The earliest surveys completed were 
on November 11, 2005, October 3, 2006, October 3, 2007, and September 8, 2008. However, the timing of fall survey 
completion does not affect the impact estimates as both treatment and control teachers completed their surveys at the 
same time on average. This issue pertains as well to participation in professional development activities and topic area 
sessions in one-year districts and to observations and feedback and participation in professional development activities 
and topic area sessions in two-year districts. 

Figure IV.10. Treatment-Control Differences in Percent of Teachers Who Received 10 Types of Mentor 
Guidance in the Last Full Week of Teaching: One-Year Districts 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. Solid (open) circles indicate that 
treatment-control differences are (not) statistically significant at the 0.05 level. N = 503 teachers in fall 
2005, 499 teachers in spring 2006, 472 teachers in fall 2006, 426 teachers in fall 2007, and 
398 teachers in fall 2008. 

Legend: A = Suggestions to improve practice. B = Encouragement or moral support. C = Opportunity to raise 
issues/discuss concerns. D = Help with administrative/logistical issues. E = Help with teaching to meet state or district 
standards. F = Help identifying teaching challenges and solutions. G = Discussed instructional goals and ways to 
achieve them. H = Guidance on how to assess students. I = Shared lesson plans, assignments, or other instructional 
activities. J = Acted on beginning teacher's request. 


g. Professional Development Activities 

During the comprehensive induction program, both treatment and control teachers reported 
participating in nine professional development activities during the three months prior to the survey. 
Among treatment teachers, participation rates across the 18 activities and periods ranged from 38 to 
78 percent and did not reach 100 percent for any activity. Among control teachers, these 
participation rates ranged from 29 to 78 percent. Figure IV.12 illustrates differences in levels of 
support for treatment and control teachers over time. 

Figure IV.11. Treatment-Control Differences in the Frequency of Selected Activities During Past Three 
Months: One-Year Districts 

[Figure: horizontal axis shows survey periods fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008.] 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. Solid (open) circles indicate that 
treatment-control differences are (not) statistically significant at the 0.05 level. In fall 2006, A is 
significant but E is not. N = 503 teachers in fall 2005, 499 teachers in spring 2006, 472 teachers in fall 
2006, 426 teachers in fall 2007, and 398 teachers in fall 2008. 

Legend: A = Teaching was observed by mentor. B = Teaching was observed by principal. C = Given feedback on 
your teaching, not as part of formal evaluation. D = Given feedback on your teaching as part of formal evaluation. 
E = Given feedback on your lesson plans. 

Treatment teachers were more likely than control teachers to report participating in certain of 
these activities during the comprehensive induction program. Treatment teachers were significantly 
more likely than control teachers to report working with study groups of new teachers (66 versus 
34 percent in fall 2005 and 71 versus 29 percent in spring 2006) and observing others teaching in 
their classrooms (61 versus 44 percent in fall 2005 and 68 versus 39 percent in spring 2006) at both 
time points during year 1, and they were significantly more likely than control teachers to report 
keeping written logs at one time point (38 versus 29 percent in spring 2006). However, treatment 
teachers were significantly less likely than control teachers to report meeting with a resource 
specialist to discuss the needs of particular students (66 versus 77 percent) in fall 2005. Treatment 
and control teachers did not differ significantly in their participation in the other five activities 
covered in the survey (keeping a portfolio and analysis of student work, working with a study group 
of new and experienced teachers, observing others teaching the beginning teacher’s class, and 
meeting with a principal or with a literacy or mathematics coach or other curricular specialist). 

Figure IV.12. Treatment-Control Differences in Percent of Teachers Who Completed Nine Types of PD 
Activities During Past Three Months: One-Year Districts 

[Figure: horizontal axis shows survey periods fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008.] 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. Solid (open) circles indicate that 
treatment-control differences are (not) statistically significant at the 0.05 level. N = 503 teachers in fall 
2005, 499 teachers in spring 2006, 472 teachers in fall 2006, 426 teachers in fall 2007, and 
398 teachers in fall 2008. 

Legend: A = Kept written log. B = Kept portfolio and analysis of student work. C = Worked with study group of new 
teachers. D = Worked with study group of new and experienced teachers. E = Observed others teaching in their 
classrooms. F = Observed others teaching your class. G = Met with principal to discuss teaching. H = Met with 
literacy or mathematics coach or other curricular specialist. I = Met with a resource specialist to discuss needs of 
particular students. 

After the comprehensive induction program ended, participation ranged widely from 13 to 
78 percent among treatment teachers and from 12 to 80 percent among control teachers across the 
27 activities and periods. In contrast to the differences in service receipt during the comprehensive 
induction program, there were no significant differences in participation between treatment and 
control teachers after the program ended in the nine activities covered in the survey with two 
exceptions across the 27 activities and periods: treatment teachers were significantly less likely to 
work with a study group of new teachers in fall 2006 (11 versus 21 percent) or to observe others 
teaching in the beginning teacher’s classroom in fall 2008 (25 versus 35 percent). 

h. Professional Development Topic Areas 

During the comprehensive induction program, treatment teachers' reported attendance at 
professional development sessions in 12 topic areas during the three months prior to the survey 
ranged widely from 21 to 78 percent and did not reach 100 percent for any topic area. The range of 
reported attendance among control teachers was 21 to 74 percent. Figure IV.13 illustrates 
differences in levels of support for treatment and control teachers over time. For both treatment 
and control teachers in both fall 2005 and spring 2006, the most attended topic area sessions were 
on instructional techniques while the least attended sessions were on understanding the composition 
of students in the teacher’s class. 

Figure IV.13. Treatment-Control Differences in Percent of Teachers Who Attended PD in 12 Topic Areas 
During the Past Three Months: One-Year Districts 

[Figure: horizontal axis shows survey periods fall 2005, spring 2006, fall 2006, fall 2007, and fall 2008.] 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. Solid (open) circles indicate that 
treatment-control differences are (not) statistically significant at the 0.05 level. N = 503 teachers in fall 
2005, 499 teachers in spring 2006, 472 teachers in fall 2006, 426 teachers in fall 2007, and 
398 teachers in fall 2008. 

Legend: A = Parent and community relations. B = School policies on student disciplinary procedures. 
C = Instructional techniques/strategies. D = Understanding the composition of students in your class. E = Content 
area knowledge (language arts, mathematics, science). F = Lesson planning. G = Analyzing student 
work/assessment. H = Student motivation/engagement. I = Differentiated instruction. J = Using computers to support 
instruction. K = Classroom management techniques. L = Preparing students for standardized testing. 

Treatment and control teachers did not differ significantly in their reported attendance at 
professional development sessions in 12 topic areas during the three months prior to the survey 
while the comprehensive induction program was being implemented, with five exceptions across the 
24 areas and periods. Treatment teachers were significantly more likely to attend professional 
development sessions in lesson planning (33 versus 22 percent) in spring 2006 and analyzing student 
work and assessment (52 versus 43 percent) in spring 2006. Treatment teachers were significantly 
less likely to attend sessions in content area knowledge (61 versus 72 percent) in fall 2005, preparing 
students for standardized testing (30 versus 41 percent) in fall 2005, and school disciplinary policies 
(32 versus 45 percent) in spring 2006. 

After the comprehensive induction program ended, participation rates remained variable among 
topic area sessions for both treatment and control teachers. Participation rates ranged from 12 to 
77 percent among treatment teachers and from 15 to 82 percent among control teachers across the 
36 areas and periods. For both treatment and control teachers in fall 2006, fall 2007, and fall 2008, 
the most attended topic area sessions were on instructional techniques while the least attended 
sessions were on parent and community relations. 

Treatment and control teachers did not differ significantly in their reported attendance at 
professional development sessions after the comprehensive induction program ended, with two 
exceptions across the 36 areas and periods: treatment teachers were significantly less likely than 
control teachers to attend PD in differentiated instruction in fall 2007 (42 versus 55 percent) and 
analyzing student work and assessment in fall 2008 (42 versus 56 percent). 
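
As a rough aid to reading the "exceptions across N areas and periods" counts, the sketch below computes how many significant differences would be expected by chance alone when many comparisons are tested at the 0.05 level. It assumes independent tests and no true differences, neither of which the report states, and the function name is illustrative only.

```python
# Expected number of chance "significant" results among n tests at level alpha,
# under the (assumed, not reported) conditions of independent tests and no true effects.
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    return n_tests * alpha

# 36 = 12 topic areas x 3 post-program survey periods (fall 2006, fall 2007, fall 2008)
print(round(expected_false_positives(36), 2))  # 1.8
```

Under those assumptions, about two of the 36 comparisons would appear significant by chance, which is close to the number of exceptions reported above.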

2. Two-Year Districts 

a. Mentor Assignments 

A majority of both treatment and control teachers reported having a mentor assigned to them 
during the comprehensive induction program, but treatment teachers were significantly more likely 
than control teachers to report this support. Figure IV.14 illustrates these levels of support for 
treatment and control teachers over time. The differences were 94 versus 79 percent in fall 2005, 
96 versus 79 percent in spring 2006, 80 versus 34 percent in fall 2006, and 84 versus 40 percent in 
spring 2007. As in the one-year districts, while some treatment teachers did not report having a 
mentor assigned to them, all treatment teachers were assigned a mentor as part of the 
comprehensive induction program. 

The percent of both treatment and control teachers with an assigned mentor dropped 
significantly from spring 2007 to fall 2007 (p-values both 0.000), after the comprehensive induction 
program ended, to less than 20 percent. In fall 2007, 11 percent of treatment teachers versus 
19 percent of control teachers reported having an assigned mentor; these rates were 7 versus 
12 percent in fall 2008. Differences between treatment and control teachers in the likelihood of 
having any mentor follow a similar pattern over time, but with a negative and significant difference 
in fall 2008. 

b. Number and Types of Mentors 

During the first year of the comprehensive induction program, treatment teachers were 
significantly more likely than control teachers to report having multiple mentors (38 versus 
23 percent in fall 2005 and 38 versus 22 percent in spring 2006) and having two assigned mentors 
(31 versus 13 percent in fall 2005 and 31 versus 17 percent in spring 2006). During the second year 
of the comprehensive induction program and after the comprehensive induction program ended, 
however, treatment-control differences in these measures were insignificant. 

Figure IV.14. Treatment-Control Differences in Percent Who Have an Assigned Mentor: Two-Year Districts 

[Figure: horizontal axis shows survey periods from fall 2005 through fall 2008.] 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction Activities 
Survey administered in spring 2007 to study teachers in two-year districts. 

Note: Data pertain to teachers in two-year districts participating in the study. N = 395 teachers in fall 2005, 
386 teachers in spring 2006, 360 teachers in fall 2006, 372 teachers in spring 2007, 326 teachers in fall 
2007, and 321 teachers in fall 2008. 

Treatment-control differences are significantly different from zero at the 0.05 level except in fall 2007 and fall 2008. 

Differences between treatment and control teachers in their mentors’ positions were significant 
during the comprehensive induction program. Treatment teachers were significantly more likely than 
control teachers to report having a full-time mentor (72 versus 16 percent in fall 2005, 75 versus 
17 percent in spring 2006, 64 versus 7 percent in fall 2006, and 67 versus 15 percent in 
spring 2007) and significantly less likely to report having a mentor who was a teacher (38 versus 
62 percent in fall 2005, 39 versus 65 percent in spring 2006, 12 versus 27 percent in fall 2006, and 
16 versus 27 percent in spring 2007). After the comprehensive induction program ended, treatment 
teachers were significantly less likely to have a mentor who was a teacher in fall 2007 (8 versus 
17 percent), but they did not differ significantly from control teachers in having a full-time mentor 
in either fall 2007 or fall 2008. 


c. Meeting Time with Mentors 

Treatment and control teachers both spent time per week in mentor meetings and activities 
during the comprehensive induction program, but treatment teachers spent significantly more time 
than control teachers in three out of four periods: from 79 to 124 minutes, compared to 41 to 
82 minutes for control teachers. Figure IV.15 illustrates average total weekly mentor meeting time 
for treatment and control teachers over time. Combining usual scheduled time and informal time 
with all mentors during the most recent full week of teaching, we found that treatment teachers 
spent an average of 124 minutes in mentor meetings versus 81 minutes for control teachers in fall 
2005, 108 versus 82 minutes in spring 2006, 82 versus 48 minutes in fall 2006, and 79 versus 
41 minutes in spring 2007. Since total meeting time was not reported directly but had to be 
constructed from reports of the frequency and duration of usual scheduled meetings and the time 
spent in informal meetings, we could not determine precisely whether treatment teachers met with 
their study mentors for two hours per week as the ETS and NTC programs expected. The 
statistically significant differences in meeting time (43 minutes in fall 2005, 34 minutes in fall 2006, 
and 38 minutes in spring 2007) were attributable entirely to differences in the duration of the usual 
scheduled meetings (24 versus 12 minutes in fall 2005, 19 versus 7 minutes in fall 2006, and 
20 versus 6 minutes in spring 2007). Differences in total meeting time in fall 2005 and fall 2006 are 
shown separately by district in Figures B.3-B.4 in Appendix B. 
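
The arithmetic behind these figures can be sketched briefly. The block below recomputes the treatment-control differences from the average total minutes quoted above, and shows one plausible way a weekly total could be built from scheduled and informal meeting time. The function `total_weekly_minutes`, its inputs, and the dictionary layout are illustrative assumptions; the report does not give the exact construction.

```python
# Average total weekly mentor meeting minutes (treatment, control), as quoted in the text.
minutes = {
    "fall 2005": (124, 81),
    "spring 2006": (108, 82),
    "fall 2006": (82, 48),
    "spring 2007": (79, 41),
}

# Treatment-minus-control difference for each survey period.
differences = {period: t - c for period, (t, c) in minutes.items()}
print(differences)
# {'fall 2005': 43, 'spring 2006': 26, 'fall 2006': 34, 'spring 2007': 38}


def total_weekly_minutes(meetings_per_week, minutes_per_meeting, informal_minutes):
    """Assumed form only: scheduled time (frequency x duration) plus informal time."""
    return meetings_per_week * minutes_per_meeting + informal_minutes


# Hypothetical teacher: one 60-minute scheduled meeting plus 30 informal minutes.
print(total_weekly_minutes(1, 60, 30))  # 90
```

The three differences reported as significant (43, 34, and 38 minutes) match the subtraction; the spring 2006 gap of 26 minutes is the period in which the difference was not significant.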

Figure IV.15. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week: Two-Year 
Districts 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction Activities 
Survey administered in spring 2007 to study teachers in two-year districts. 

Note: Data pertain to teachers in two-year districts participating in the study. N = 395 teachers in fall 2005, 
386 teachers in spring 2006, 360 teachers in fall 2006, 372 teachers in spring 2007, 326 teachers in fall 
2007, and 321 teachers in fall 2008. 

Treatment-control differences are significantly different from zero at the 0.05 level except in spring 2006, fall 2007, 
and fall 2008. 

Figure IV.16. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week with Mentors Who 
Are Teachers: Two-Year Districts 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction Activities 
Survey administered in spring 2007 to study teachers in two-year districts. 

Note: Data pertain to teachers in two-year districts participating in the study. N = 395 teachers in fall 2005, 
386 teachers in spring 2006, 360 teachers in fall 2006, 372 teachers in spring 2007, 326 teachers in fall 
2007, and 321 teachers in fall 2008. 

Treatment-control differences are significantly different from zero at the 0.05 level except in spring 2007, fall 2007, 
and fall 2008. 

Differences in the time spent with full-time mentors and mentors who were teachers reflected 
differences in the types of mentors that treatment and control teachers reported. Treatment teachers 
reported spending significantly more time during the most recent week of teaching meeting with 
full-time mentors than did control teachers (75 versus 6 minutes in fall 2005, 71 versus 10 minutes in 
spring 2006, 59 versus 2 minutes in fall 2006, and 54 versus 6 minutes in spring 2007). However, they 
reported significantly less time than control teachers with mentors who were teachers (39 versus 
70 minutes in fall 2005, 32 versus 70 minutes in spring 2006, 14 versus 42 minutes in fall 2006, and 
19 versus 23 minutes in spring 2007). Figure IV.16 illustrates average minutes each week meeting 
with mentors who were teachers for treatment and control teachers over time. 

After the comprehensive induction program ended, the time spent by both treatment and 
control teachers in mentor meetings dropped significantly (p-values both 0.000) to less than 
15 minutes per week. There were no significant differences between treatment and control teachers 
in the time spent meeting with mentors during this period (12 versus 15 minutes in fall 2007 and 
11 versus 18 minutes in fall 2008). No significant differences were found in either component of 
total mentor meeting time, usual scheduled meetings (7 versus 9 minutes in fall 2007 and 6 versus 
9 minutes in fall 2008) or informal meetings (6 versus 6 minutes in fall 2007 and 5 versus 9 minutes 
in fall 2008). Differences between treatment and control teachers were also insignificant during this 
period in terms of time spent with full-time mentors and mentors who were teachers. 

d. Mentor Activities 

During the comprehensive induction program, treatment teachers reported spending 7 to 
43 minutes in each of six types of mentoring activities during the most recent full week of teaching 
while control teachers reported spending 2 to 23 minutes. For both treatment and control teachers, 
time spent varied across activities and time periods. Figure IV.17 illustrates differences in levels of 
support for treatment and control teachers over time. 

Figure IV.17. Treatment-Control Differences in Time Spent in Six Mentoring Activities in the Last Full Week of 
Teaching: Two-Year Districts 

[Figure: horizontal axis shows survey periods fall 2005, spring 2006, fall 2006, spring 2007, fall 2007, and fall 2008.] 

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005, 
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction Activities 
Survey administered in spring 2007 to study teachers in two-year districts. 

Note: Data pertain to teachers in two-year districts participating in the study. Solid (open) circles indicate that 
treatment-control differences are (not) statistically significant at the 0.05 level. In fall 2005, E is 
significant but D is not. N = 395 teachers in fall 2005, 386 teachers in spring 2006, 360 teachers in fall 
2006, 372 teachers in spring 2007, 326 teachers in fall 2007, and 321 teachers in fall 2008. 

Legend: A = Observing BT teaching. B = Meeting with BT one-on-one. C = Meeting with BT and other first-year 
teachers. D = Meeting with BT and other teachers. E = Modeling a lesson. F = Co-teaching a lesson. BT = Beginning 
teacher. 


Treatment teachers spent more time than control teachers in certain of these activities during 
certain time periods during the comprehensive induction program: being observed by mentors 
(38 versus 17 minutes in fall 2005, 26 versus 16 minutes in spring 2006, 22 versus 7 minutes in fall 
2006, and 19 versus 8 minutes in spring 2007), meeting one-on-one with mentors (43 versus 
23 minutes in fall 2005, 38 versus 21 minutes in spring 2006, 25 versus 12 minutes in fall 2006, and 
29 versus 10 minutes in spring 2007), meeting together with mentors and other first-year teachers 
(38 versus 11 minutes in fall 2005, 35 versus 9 minutes in spring 2006, 25 versus 6 minutes in fall 
2006, and 24 versus 5 minutes in spring 2007), and having mentors model lessons (16 versus 
10 minutes in fall 2005, 14 versus 8 minutes in spring 2006, 12 versus 5 minutes in fall 2006, and 
10 versus 4 minutes in spring 2007).* Treatment teachers did not spend significantly more time than 
control teachers meeting together with mentors and other teachers or co-teaching lessons with 
mentors, except in spring 2007 (16 versus 8 minutes and 8 versus 2 minutes, respectively). The total 
minutes per week spent by treatment and control teachers in the six types of activities covered in the 
survey was 170 versus 87 minutes in fall 2005, 146 versus 77 minutes in spring 2006, 106 versus 
44 minutes in fall 2006, and 105 versus 36 minutes in spring 2007. 
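
As a check on how the activity-level figures relate to the totals, the sketch below sums the six spring 2007 activity times quoted above and compares them with the reported totals. Spring 2007 is the only period for which all six activity values appear in the text. The one-minute gaps are consistent with rounding of the underlying means, although the unrounded values are not available here; the variable names are illustrative.

```python
# Spring 2007 minutes per week by activity (treatment, control), as quoted in the text.
spring_2007 = {
    "observed by mentor": (19, 8),
    "one-on-one meeting with mentor": (29, 10),
    "meeting with mentor and other first-year teachers": (24, 5),
    "mentor modeling a lesson": (10, 4),
    "meeting with mentor and other teachers": (16, 8),
    "co-teaching a lesson with mentor": (8, 2),
}

treatment_sum = sum(t for t, _ in spring_2007.values())
control_sum = sum(c for _, c in spring_2007.values())

# Totals reported in the text for spring 2007 (treatment, control).
reported_totals = (105, 36)
gaps = (treatment_sum - reported_totals[0], control_sum - reported_totals[1])

print(treatment_sum, control_sum)  # 106 37
print(gaps)                        # (1, 1)
```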

After the comprehensive induction program ended, both treatment and control teachers 
reported spending less than 10 minutes per week in each of the six types of mentoring activities 
during the most recent full week of teaching. Differences between treatment and control teachers in 
total time spent in the six activities and in all individual activities were not statistically significant. 

e. Areas of Mentor Guidance 

During the comprehensive induction program, at least half of all treatment teachers reported 
receiving mentors’ assistance during the most recent full week of teaching in all 10 topic areas 
covered in the survey in each period, while roughly 20 to 75 percent of control teachers reported 
receiving assistance. Figure IV.18 illustrates differences in levels of support for treatment and 
control teachers over time. The percentage of treatment teachers receiving each type of assistance 
ranged from 66 to 92 percent in fall 2005, 70 to 92 percent in spring 2006, 50 to 72 percent in fall 
2006, and 58 to 78 percent in spring 2007, but then dropped after the comprehensive induction 
program ended to 8 to 13 percent in fall 2007, and 3 to 6 percent in fall 2008. Among control 
teachers, the percentage reporting each type of assistance ranged from 48 to 73 percent in fall 2005, 
44 to 70 percent in spring 2006, 21 to 30 percent in fall 2006, 19 to 38 percent in spring 2007, 8 to 
17 percent in fall 2007, and 7 to 13 percent in fall 2008. The area in which both treatment and 
control teachers received the most guidance in each period was encouragement or moral support, and 
the area in which they received the least guidance was how to assess students, with one exception: 
treatment teachers in fall 2006 received the least help in the area of teaching to meet state or district 
standards. 

Examining differences in service receipt during the comprehensive induction program, we
found treatment teachers were significantly more likely than control teachers to report receipt of
mentors’ assistance during the most recent full week of teaching in all 10 topic areas covered in the
survey. Differences ranged from 14 to 28 percentage points in fall 2005, 28 to 44 percentage points
in spring 2006, 21 to 31 percentage points in fall 2006, and 31 to 41 percentage points in spring
2007. The following areas of guidance were among the top four in terms of impact size in three of
the four periods: receiving suggestions to improve practice, having opportunities to raise issues or 
discuss concerns, receiving help identifying teaching challenges and solutions, and discussing 
instructional goals and ways to achieve them. 

After the comprehensive induction program ended, fewer than 20 percent of treatment or
control teachers reported receiving each type of assistance during the last full week of teaching.
Among treatment teachers, the percentage reporting each type of assistance ranged from 4 to
13 percent, while among control teachers, the percentage reporting each type of assistance ranged
from 8 to 17 percent. There were no significant differences between treatment and control in the
receipt of mentors’ assistance in any of the topic areas covered in the survey, with one exception
across the 20 areas and periods: treatment teachers were significantly less likely than control teachers
to share lesson plans, assignments, or other instructional activities (5 versus 13 percent) in fall 2008.

¹ Implied differences in the time spent in some activities may not match the differences shown in Figure IV.17
due to rounding.

Figure IV.18. Treatment-Control Differences in Percent of Teachers Who Received 10 Types of Mentor
Guidance in the Last Full Week of Teaching: Two-Year Districts

[Figure not reproduced. Horizontal axis: survey period (fall 2005, spring 2006, fall 2006, spring 2007, fall 2007, fall 2008).]

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005,
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction Activities
Survey administered in spring 2007 to study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Solid (open) circles indicate that
treatment-control differences are (not) statistically significant at the 0.05 level. In fall 2008, A is
significant but F and J are not. N = 395 teachers in fall 2005, 386 teachers in spring 2006, 360 teachers
in fall 2006, 372 teachers in spring 2007, 326 teachers in fall 2007, and 321 teachers in fall 2008.

Legend: A = Suggestions to improve practice. B = Encouragement or moral support. C = Opportunity to raise 
issues/discuss concerns. D = Help with administrative/logistical issues. E = Help with teaching to meet state or district 
standards. F = Help identifying teaching challenges and solutions. G = Discussed instructional goals and ways to 
achieve them. H = Guidance on how to assess students. I = Shared lesson plans, assignments, or other instructional 
activities. J = Acted on a beginning teacher's request. 

f. Observations and Feedback 

During the comprehensive induction program, both treatment and control teachers reported 
receiving two types of observations and three types of feedback at least once during the three 
months prior to the survey. Treatment teachers received certain types of observations and feedback
significantly more frequently than control teachers. Figure IV.19 illustrates differences in levels of
support for treatment and control teachers over time. 

Examining differences in service receipt during the comprehensive induction program, 
treatment teachers were significantly more likely than control teachers to be observed by mentors 
(3.4 versus 2.1 times in fall 2005, 3.2 versus 1.6 times in spring 2006, 2.3 versus 0.8 times in fall 
2006, and 2.5 versus 1.0 times in spring 2007) and to receive informal feedback (2.5 versus 2.0 times
in spring 2006, 1.9 versus 1.5 times in fall 2006, and 2.2 versus 1.5 times in spring 2007) during the
comprehensive induction program, with one exception across these 8 activities and periods:
treatment teachers were not significantly more likely than control teachers to receive informal
feedback during fall 2005. Treatment and control teachers did not differ significantly in the
frequency with which they were observed by their principals or the frequency with which they
received feedback on lesson plans or as part of a formal evaluation during the intervention, with one
exception across these 12 activities and periods: treatment teachers were more likely than control
teachers to receive feedback as part of a formal evaluation in spring 2007 (1.6 versus 1.3 times).

Figure IV.19. Treatment-Control Differences in the Frequency of Selected Activities During Past Three
Months: Two-Year Districts

[Figure not reproduced. Horizontal axis: survey period (fall 2005, spring 2006, fall 2006, spring 2007, fall 2007, fall 2008).]

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005,
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction Activities
Survey administered in spring 2007 to study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Solid (open) circles indicate that
treatment-control differences are (not) statistically significant at the 0.05 level. N = 395 teachers in fall
2005, 386 teachers in spring 2006, 360 teachers in fall 2006, 372 teachers in spring 2007, 326 teachers
in fall 2007, and 321 teachers in fall 2008.

Legend: A = Teaching was observed by mentor. B = Teaching was observed by principal. C = Given feedback on
your teaching, not as part of formal evaluation. D = Given feedback on your teaching as part of formal evaluation. 
E = Given feedback on your lesson plans. 

After the intervention, the frequency of the two types of observations during the three months
prior to the survey ranged from 0 to 2 times among treatment teachers and 0 to 1 time among
control teachers, while the frequency of the three types of feedback ranged from 1 to 2 times for 
both treatment and control teachers. Treatment and control teachers did not differ significantly in 
the frequency with which they engaged in these activities with one exception across the 10 activities 
and periods: treatment teachers were significantly more likely than control teachers to receive 
informal feedback in fall 2008 (1.5 versus 1.0 times).

g. Professional Development Activities

During the comprehensive induction program, both treatment and control teachers reported 
participating in nine professional development activities during the three months prior to the survey. 
Figure IV.20 illustrates differences in levels of support for treatment and control teachers over time.
Among treatment teachers, participation rates across the 36 activities and periods ranged from 33 to
86 percent and did not reach 100 percent for any activity. The most commonly reported activity 
among treatment teachers in each period was keeping a portfolio and analysis of student work, while 
the least commonly reported activity was keeping a written log. Among control teachers, these 
participation rates ranged from 14 to 84 percent, with the most commonly reported activity in each 
period being keeping a portfolio and analysis of student work and the least commonly reported 
activity in each period being working with a study group of new teachers. 

Treatment teachers were more likely than control teachers to report participating in certain of 
these activities during the comprehensive induction program. Treatment teachers were significantly 
more likely to work with a study group of new teachers (67 versus 24 percent in fall 2005, 64 versus 
25 percent in spring 2006, 42 versus 19 percent in fall 2006, and 48 versus 14 percent in spring 2007) 
or a study group of new and experienced teachers (48 versus 34 percent in spring 2006, 54 versus 
40 percent in fall 2006, and 51 versus 36 percent in spring 2007), with one exception across these 
8 activities and periods: treatment teachers were no more likely than control teachers to work with a 
group of new and experienced teachers in fall 2005. Treatment teachers were also significantly more
likely than control teachers to observe others teaching in their classrooms in spring 2006 and spring
2007 (72 versus 47 percent and 47 versus 35 percent, respectively) or teaching the beginning
teacher's class in spring 2006 (48 versus 36 percent). Treatment and control teachers did not differ
significantly in their participation in the other five activities covered in the survey (keeping a written 
log; keeping a portfolio and analysis of student work; meeting with a principal; or meeting with a 
literacy or mathematics coach, other curricular specialist, or resource specialist) in any period with 
one exception across these 20 activities and periods: treatment teachers were more likely than 
control teachers to keep a written log in spring 2006 (42 versus 26 percent). 

After the comprehensive induction program ended, participation ranged widely from 12 to 
84 percent among treatment teachers and from 12 to 85 percent among control teachers across the 
18 activities and periods. In contrast to the differences in service receipt during the comprehensive
induction program, there were no significant differences between treatment and control teachers in 
any of the professional development activities covered in the survey during these periods.

Figure IV.20. Treatment-Control Differences in Percent of Teachers Who Completed Nine Types of PD
Activities During the Past Three Months: Two-Year Districts

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005,
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction Activities
Survey administered in spring 2007 to study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Solid (open) circles indicate that
treatment-control differences are (not) statistically significant at the 0.05 level. N = 395 teachers in fall
2005, 386 teachers in spring 2006, 360 teachers in fall 2006, 372 teachers in spring 2007, 326 teachers
in fall 2007, and 321 teachers in fall 2008.

Legend: A = Kept written log. B = Kept portfolio and analysis of student work. C = Worked with study group of new 
teachers. D = Worked with study group of new and experienced teachers. E = Observed others teaching in their 
classrooms. F = Observed others teaching your class. G = Met with principal to discuss teaching. H = Met with 
literacy or mathematics coach or other curricular specialist. I = Met with a resource specialist to discuss needs of 
particular students. 

h. Professional Development Topic Areas 

During the comprehensive induction program, the reported attendance of treatment teachers at 
professional development sessions in 12 topic areas during the three months prior to the survey 
ranged widely from 24 to 80 percent and did not reach 100 percent for any topic area. The range of
reported attendance among control teachers was 18 to 79 percent. Figure IV.21 illustrates 
differences in levels of support for treatment and control teachers over time. For both treatment 
and control teachers in all four periods, the most attended topic sessions were on instructional 
techniques. The least attended sessions included understanding the composition of students in the 
teacher’s class in three of the four periods for treatment teachers and in two of the four periods for
control teachers. 

Treatment teachers were significantly more likely than control teachers to attend professional 
development activities in certain topic areas during the three months prior to the survey while the
comprehensive induction program was being implemented. These areas included understanding the
composition of students in your class in spring 2006 (32 versus 22 percent), content area knowledge
in spring 2006 (70 versus 60 percent), lesson planning in spring 2006 and spring 2007 (43 versus
31 percent and 38 versus 28 percent, respectively), analyzing student work and assessment in spring
2006 and spring 2007 (60 versus 41 percent and 57 versus 45 percent, respectively), student
motivation and engagement in spring 2007 (43 versus 24 percent), differentiated instruction in
spring 2006 and spring 2007 (62 versus 47 percent and 58 versus 43 percent, respectively), and
classroom management techniques in fall 2005 and spring 2006 (61 versus 48 percent and 53 versus
34 percent, respectively). Treatment teachers were not significantly more likely than control teachers
to attend PD on parent and community relations, school policies on student disciplinary procedures,
instructional techniques and strategies, using computers to support instruction, or preparing
students for standardized testing during the intervention (20 areas and periods). The lack of
significant differences between treatment and control teachers during the fall 2005 and fall 2006
surveys is not due to the recall period including part of the summer vacation for some teachers since 
treatment and control teachers completed their surveys at the same time on average. 

After the comprehensive induction program ended, participation rates remained variable among 
topic area sessions for both treatment and control teachers. Participation rates ranged from 15 to 
78 percent among treatment teachers and from 15 to 78 percent among control teachers across the 
24 areas and periods. For both treatment and control teachers in fall 2007 and fall 2008, the most 
attended topic area sessions were on instructional techniques while the least attended sessions were 
on parent and community relations. Treatment teachers were not significantly more likely than 
control teachers to attend PD during this period, with two exceptions across the 24 areas and 
periods: treatment teachers were more likely to attend PD on instructional techniques and strategies 
in fall 2007 (78 versus 64 percent) and in analyzing student work and assessment in fall 2008 
(55 versus 39 percent). 




Figure IV.21. Treatment-Control Differences in Percent of Teachers Who Attended PD in 12 Topic Areas 
During the Past Three Months: Two-Year Districts

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005,
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction Activities
Survey administered in spring 2007 to study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Solid (open) circles indicate that
treatment-control differences are (not) statistically significant at the 0.05 level. N = 395 teachers in fall
2005, 386 teachers in spring 2006, 360 teachers in fall 2006, 372 teachers in spring 2007, 326 teachers
in fall 2007, and 321 teachers in fall 2008.

Legend: A = Parent and community relations. B = School policies on student disciplinary procedures. 
C = Instructional techniques/strategies. D = Understanding the composition of students in your class. E = Content 
area knowledge (language arts, mathematics, science). F = Lesson planning. G = Analyzing student
work/assessment. H = Student motivation/engagement. I = Differentiated instruction. J = Using computers to support
instruction. K = Classroom management techniques. L = Preparing students for standardized testing.

V. IMPACT FINDINGS: CLASSROOM OUTCOMES

In this chapter, we present the estimated impacts of comprehensive induction on classroom 
outcomes. We hypothesize that treatment improves teachers’ classroom practices and ultimately 
their students’ achievement as reflected in standardized test scores. First, we examine the effect on
teachers’ classroom practices in the teaching of a literacy lesson in their first year of teaching,
drawing on the results that were reported in the study’s first interim report (Glazerman et al. 2008).
Then, we test whether the students of teachers who received comprehensive induction performed 
better on standardized tests than did students of teachers who received the usual induction services 
(the control group). For both sets of outcomes, we present a summary of methods, findings, and 
sensitivity tests. For the classroom practices analysis, we focused on those teachers responsible for 
English language arts or literacy classes (698 teachers). For test score analyses, we focused on
teachers in tested grades and subjects (about 200 teachers per year). Results pertaining to literacy 
instruction do not necessarily apply to teachers of other subjects. Similarly, results for teachers in 
tested grades do not necessarily apply to teachers of other grades or subjects. Readers may refer to 
Appendix A for a detailed description of analytic methods and to Appendix C for supplementary 
tables. 

A. Classroom Practices 

The conceptual framework presented in Chapter I suggests that for teacher induction to
improve student achievement, it must first change the way teachers teach. To test for changes in 
teacher practices, we sent trained observers into treatment and control classrooms to administer the 
Diagnostic Classroom Observation (DCO), described in Chapter III and Appendix A. The DCO, 
formerly known as the Vermont Classroom Observation Tool, measures three domains of teaching: 
lesson implementation, lesson content, and classroom culture. It captures the degree to which the 
observed lesson reflects evidence of what are believed to be effective practices. We observed literacy
lessons (or reading/language arts lessons) in more than 600 classrooms in spring 2006 (year 1 of the
study). By design, we did not repeat the observations in later years. Teachers who were teaching
special populations, were teaching subjects other than reading/English language arts, were no longer
teaching, or had prior teaching experience were not observed (by design) and therefore not included 
in this component of the impact analysis. See Chapter III for a more detailed discussion of the 
sample exclusions. 

We estimated impacts on classroom practices using the regression methods described in 
Appendix A. Because observations were conducted only in the first year of the study, we combined 
results from one-year and two-year districts. As discussed in Chapter III, observers scored teachers 
on a five-point scale in each of the three domains based on a set of 16 items believed to be 
indicators of effective practice. The three domains cover five, four, and seven of the indicators, 
respectively. The full set of 16 indicators is shown by domain in Appendix C (Table C.1), and
covariates in the model are listed in Table A.1.

To summarize the information from the classroom observations across all 16 indicators, we 
produced three scores corresponding to the three domains captured by the observation protocol 
(into which the items had already been grouped). The benchmark estimates use the average score of 
the indicators within each domain and thus assume that the intervals between each category are
equal. For example, the difference between “no evidence” and “limited evidence” is the same as the
difference between “moderate evidence” and “consistent evidence.” They also assign equal weight to
the indicators within each domain. In other words, we assume that a score of 3 on two indicators of
classroom culture (for example, “Classroom routines are clear and consistent” and “Behavior is
respectful and appropriate”) is equivalent to a score of 4 on one of those indicators and a 2 on the
other. Histograms for treatment and control teachers’ performance in each of the three domains are
included in Appendix C (Figures C.1–C.3). These histograms illustrate the pattern of variation (or
distribution) of the classroom practices data. We also present impact estimates for literacy
implementation scores separately by district in Figures C.4–C.5.
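
The equal-interval and equal-weight assumptions can be illustrated with a minimal sketch. The function name `domain_score` and the item scores below are hypothetical, chosen only to mirror the example in the text; the study's actual scoring procedure is described in Appendix A.

```python
from statistics import mean


def domain_score(item_scores):
    """Unweighted mean of the indicator scores (1-5 scale) within one domain.

    Taking a simple mean treats the distance between adjacent categories as
    equal and gives every indicator the same weight.
    """
    if not item_scores:
        raise ValueError("domain has no scored indicators")
    return mean(item_scores)


# Hypothetical scores on two classroom culture indicators.
teacher_a = [3, 3]  # moderate evidence on both indicators
teacher_b = [4, 2]  # consistent evidence on one, limited evidence on the other

# Under the benchmark assumptions the two teachers get the same domain score.
print(domain_score(teacher_a), domain_score(teacher_b))
```

A scoring rule that weighted indicators unequally, or that treated the categories as ordinal only, would not necessarily rate these two teachers the same.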

No Impact on Classroom Practices. There was no statistically significant impact of treatment 
on classroom practices for teachers in one-year and two-year districts combined (Table V.1). After
controlling for important teacher and school characteristics, we observed no statistically significant 
differences between treatment and control teachers' performance on the implementation of a 
literacy lesson, content of a literacy lesson, or classroom culture. We express the impact on each 
domain of classroom practice as the difference in scores on the five-point scale. An impact of 
0.5 points, for example, would suggest that the intervention moves the average teacher from being 
able to demonstrate "moderate" evidence of a particular practice in that domain half of the distance 
to being able to demonstrate “consistent” evidence of that practice. (The observed estimates of the 
impacts were smaller than the 0.5 points used in this example.) 


Table V.1. Impacts on Classroom Practices (Average Score on a Five-Point Scale): One-Year Districts and
Two-Year Districts Combined, 2005-2006 School Year

Outcome                             Treatment   Control   Difference   Effect Size   P-value
Implementation of literacy lesson   2.7         2.6       0.0          0.02          0.766
Content of literacy lesson          2.4         2.4       0.0          -0.01         0.875
Classroom culture                   3.1         3.0       0.0          0.04          0.629
Sample Size (Teachers)              342         289

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data are weighted and regression adjusted using ordinary least squares to account for differences in
baseline characteristics and the study design. Scoring scale: (1) no evidence, (2) limited evidence, (3)
moderate evidence, (4) consistent evidence, or (5) extensive evidence of effective teaching practice.

None of the differences is statistically significant at the 0.05 level.


Findings Were Robust. We reestimated the impacts using a variety of assumptions about item
scoring and estimation and found that the results did not change substantially. The results were not
sensitive to how we grouped the individual items into constructs, nor did they change when we
collapsed the scale. We estimated the model separately for each classroom observation item after
recoding each score from a five-point scale into a binary variable: (1) no, limited, or moderate
evidence or (2) consistent or extensive evidence of good practice. The results support the same
conclusions of no impact (see Table C.1 in Appendix C). The results were also not sensitive to the
choice of summary score; when we substituted the observer summary scores for the computed
average scores, we reached the same conclusions of no impact (see Table C.2). Finally, when we
estimated the model separately for one-year and two-year districts (see Appendix C, Tables C.3 and
C.4), we found that the impact estimates were not significantly different from zero.
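
The collapsing of the five-point scale described above can be sketched as follows. This is an illustration only: the function names and the sample scores are hypothetical, and the 0/1 coding is an assumed convention for the two categories named in the text.

```python
def recode_binary(score):
    """Collapse a 1-5 observation score into a binary indicator.

    Returns 0 for no, limited, or moderate evidence (scores 1-3) and
    1 for consistent or extensive evidence (scores 4-5).
    """
    if score not in (1, 2, 3, 4, 5):
        raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
    return 1 if score >= 4 else 0


def share_high(scores):
    """Fraction of observations showing consistent or extensive evidence."""
    flags = [recode_binary(s) for s in scores]
    return sum(flags) / len(flags)


# Hypothetical scores on one observation item for four teachers.
item_scores = [1, 4, 5, 3]
print(share_high(item_scores))
```

Comparing such shares between treatment and control teachers, item by item, does not rely on the equal-interval assumption that the averaged scores require.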

B. Student Achievement

We compared the test scores for students of treatment teachers to those of control teachers,
adjusted for pretest scores and background characteristics of students and teachers. Although
district-administered test scores do not cover every domain of student achievement that induction
might affect, they do capture the content that school districts or states deem important and worthy 
of assessing. 

We present results for the teachers’ third year, the 2007-2008 school year. Results for the first 
two years show no overall impacts in either subject, as documented in earlier reports (Glazerman et
al. 2008; Isenberg et al. 2009). Although comprehensive teacher induction services ended after the
2005-2006 school year in one-year districts and after the 2006-2007 school year in two-year
districts, there can be delayed impacts of induction programs because teachers may not be able to
immediately implement the advice they have been given (Isenberg et al. 2010). The teachers' third
year was two years following the end of the intervention in one-year districts, and one year following 
the end of the intervention in two-year districts. 

For year 3, we found no evidence of a positive impact on test scores in either subject in one- 
year districts, but evidence of a positive and significant impact on both subjects in two-year districts. 
We checked the findings using different methods of aggregation, model specification, and model 
estimation, and we report on both the benchmark model and alternative specifications. The no-impact
results for reading in one-year districts are robust to these sensitivity analyses, but impacts
for math may be either statistically insignificant or negative and significant if we make alternative
assumptions about the statistical model or data processing. For two-year districts, results for reading
and math are not robust to all specifications and samples, as they may be either positive and
significant or statistically insignificant, depending on the model. 

Estimating impacts on student achievement required the use of test score data from 15 districts, 
which administered different tests under different conditions and followed different record-keeping 
practices. Although 10 one-year districts participated in the study, one of these districts was unable 
to match teachers in the study with student test scores. A second district declined to share its data in 
the third year of the study. All 7 two-year districts participated in all years of the study, but in the 
third year of the study, as a result of attrition, one district had no grades in which there was at least 
one treatment teacher and one control teacher who could be compared to each other. 

We aggregated test scores across districts and grades by standardizing each test to a common 
metric called a z-score, which has a mean of zero and a standard deviation of one. The benchmark
model accounts for the nesting of students within schools. As shown in Table A.1, the covariates in
the benchmark model are (1) the normalized student pretest score (interacted with district-by-grade
fixed effects so that each test has a different effect); (2) student characteristics; (3) teacher personal
characteristics; (4) teacher professional characteristics; and (5) district-by-grade fixed effects.
Appendix A describes in more detail the aggregation method, treatment of missing data, regression 
model, and estimation strategies. 
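
A minimal sketch of z-score standardization may help make the aggregation step concrete. It assumes, for illustration only, that each test is standardized against the students in the same district-by-grade cell; the reference population actually used is specified in Appendix A, and the function name and data below are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_group(records):
    """Convert raw scores to z-scores within each group.

    `records` is a list of (group_key, raw_score) pairs, where group_key
    identifies one test (here, an assumed district-by-grade cell). Returns
    z-scores in the same order as the input.
    """
    by_group = defaultdict(list)
    for key, score in records:
        by_group[key].append(score)

    # Mean and (population) standard deviation of each group's raw scores.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in by_group.items()}

    z_scores = []
    for key, score in records:
        mu, sigma = stats[key]
        if sigma == 0:
            raise ValueError(f"no score variation in group {key!r}")
        z_scores.append((score - mu) / sigma)
    return z_scores


# Two hypothetical tests reported on very different raw scales.
records = [
    ("district1-grade3", 10), ("district1-grade3", 20),
    ("district2-grade4", 300), ("district2-grade4", 500),
]
print(standardize_by_group(records))
```

After standardization, scores from tests with different scales can be pooled, because each is expressed in standard deviation units relative to its own test.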

1. One-Year Districts: No Impacts on Math or Reading

For one-year districts, the benchmark estimates of the impacts on math and reading scores were
not significantly different from zero (see Table V.2).

Table V.2. Impacts on Test Scores: One-Year Districts, 2007-2008 School Year

            Adjusted Mean Test Scores    Difference                          Sample Sizes
Subject     Treatment     Control        (Effect Size)   P-value    Students   Teachers   Districts
Reading     -0.24         0.24           0.01            0.801      1,690      99         8
Math        -0.17         0.07           -0.10           0.108      1,629      95         8

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided
by participating school districts; Mathematica Teacher Background Survey administered in fall
2005 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are regression
adjusted to account for pretest, student and teacher characteristics, district-by-grade fixed
effects, and clustering of students within schools. Treatment and control group sample sizes
are shown in Appendix Table C.5.

None of the differences is statistically significant at the 0.05 level.

We performed a sensitivity analysis by reestimating the impacts using different samples, sets of
covariates, and estimation techniques: 

• Disaggregating results by grade 

• Removing teacher characteristics as control variables, so that only pretest, student 
background characteristics, and district-by-grade fixed effects remain (see Table A.1 in
Appendix A for a list of control variables)

• Removing teacher and student characteristics as control variables so that only pretest and
district-by-grade fixed effects remain (see Table A.1 in Appendix A for a list of control
variables) 

• Estimating impacts using a set of weights so that each school receives an equal weight in 
the analysis 

• Estimating impacts using a set of weights so that each district receives an equal weight in 
the analysis 

• Using specific information on teaching assignments gathered during telephone followup
to surveys, to determine eligibility for analysis 

• Including all students with a pretest and posttest, without imposing restrictions on the 
teachers and students included in the sample 

• Estimating impacts without controlling for a pretest, using the same sample as the 
benchmark model 

• Estimating impacts without controlling for a pretest, expanding the sample to include 
students with a posttest only 

• Estimating impacts in a model that compares treatment and control teachers to each 
other within a district rather than within a district-grade combination 

• Estimating impacts using the opposite-subject pretest as an instrumental variable to 
remove the influence of measurement error in the pretest 
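
As an illustration of the equal-weight checks in the list above, the sketch below builds student-level weights so that each school contributes the same total weight. The report does not state how its weights were constructed, so the inverse-count rule, the function names, and the data are all assumptions made for this example.

```python
from collections import Counter


def equal_school_weights(school_ids):
    """Return one weight per student so that each school's weights sum to 1.

    Assumed rule: a student's weight is 1 divided by the number of students
    observed in that student's school.
    """
    counts = Counter(school_ids)
    return [1.0 / counts[s] for s in school_ids]


def weighted_mean(values, weights):
    """Weighted average of `values` using `weights`."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical data: school A has three students, school B has one.
schools = ["A", "A", "A", "B"]
scores = [0.0, 0.0, 0.0, 1.0]

weights = equal_school_weights(schools)
print(weighted_mean(scores, weights))
```

Without weights, the large school would dominate the average; with them, the small school counts as much as the large one. The same idea applies to district-level weights by grouping on district instead of school.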

Grade-specific estimates are useful in that they can illustrate heterogeneity of impacts, and they
do not require the assumption that increments of different types of learning be on the same scale.
Keep in mind, however, that the study was not designed to detect significant impacts at the grade
level. We also present estimates of the impacts separately by district in Figures C.6–C.9 in Appendix
C. Details of the method that generated these results can be found in Appendix A.

Results for Reading in One-Year Districts Are Robust. When the reading results for one-year
districts are disaggregated by grade, the impact estimate for grade 5 is significantly positive and
the estimates are not significant for grades 2, 3, and 4. Results are shown in the top panel of Table
V.3.


Table V.3. Impacts on Test Scores by Grade: One-Year Districts, 2007-2008 School Year 

                Adjusted Mean Test Scores   Difference                        Sample Sizes
Subject/Grade   Treatment   Control        (Effect Size)   P-value   Students   Teachers   Districts

Reading
  Grade 2         -0.04       0.03             0.07         0.115         444       25         4
  Grade 3         -0.14       0.10             0.04         0.650         421       27         5
  Grade 4          0.40       0.41             0.01         0.901         479       32         7
  Grade 5         -0.37      -0.55             0.18*        0.024         346       19         4
  All Grades      -0.24      -0.24             0.01         0.801       1,690       99         8

Math
  Grade 2          0.31       0.45            -0.13         0.299         200       13         2
  Grade 3         -0.10       0.25            -0.35*        0.002         422       27         5
  Grade 4          0.29       0.21             0.07         0.429         661       40         8
  Grade 5          0.30       0.46             0.16         0.143         346       19         4
  All Grades      -0.17      -0.07            -0.10         0.108       1,629       95         8

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by 
        participating school districts; Mathematica Teacher Background Survey administered in fall 
        2005 to all study teachers. 

Note:   Data pertain to teachers in one-year districts participating in the study. Data are regression 
        adjusted to account for pretest, student and teacher characteristics, district-by-grade fixed 
        effects, and clustering of students within schools. Treatment and control group sample sizes 
        are shown in Appendix Table C.6. 

*Significantly different from zero at the 0.05 level. 

A set of additional specification checks, shown in the top panel of Table V.4, confirmed that 
there was no statistically significant effect of treatment on reading in the third year of teaching. The 
top row of Table V.4 repeats the results of the benchmark analysis for reference. The second and 
third rows are estimated with fewer covariates than in the benchmark model. For the results shown 
in the second row, teacher background characteristics have been excluded so that we control for 
only pretest, student background characteristics, and district-grade fixed effects. In the third row, we 
exclude both teacher and student background characteristics so that we control only for pretest and 
district-grade fixed effects. In both cases, the impact estimates, consistent with the benchmark, are 
not significantly different from zero. See Appendix Table A.1 for a list of student and teacher 
covariates used in these models. 

Table V.4. Impacts on Test Scores, Alternate Model Specifications: One-Year Districts, 2007-2008 
School Year 

                                            Adjusted Mean Test Scores   Difference                        Sample Sizes
Subject/Model                               Treatment   Control        (Effect Size)   P-value   Students   Teachers   Districts

Reading
  (1) Benchmark                               -0.24      -0.24             0.01         0.801       1,690       99         8
  (2) No teacher covariates                   -0.23      -0.25             0.02         0.638       1,690       99         8
  (3) No teacher or student covariates        -0.23      -0.25             0.02         0.669       1,690       99         8
  (4) Schools weighted equally                -0.25      -0.24            -0.01         0.768       1,690       99         8
  (5) Districts weighted equally              -0.23      -0.24             0.01         0.764       1,690       99         8
  (6) Using specific information on           -0.25      -0.24             0.00         0.970       1,540       89         8
      teacher assignments
  (7) Without imposing data restrictions      -0.26      -0.28             0.02         0.600       1,892      107         8
  (8) No pretest, benchmark sample            -0.18      -0.29             0.10         0.108       1,690       99         8
  (9) No pretest, expanded sample             -0.17      -0.27             0.10         0.085       2,856      151         8
  (10) Compare teachers within districts,     -0.27      -0.27             0.01         0.880       1,973      114         8
       not district-grades
  (11) Instrumental variables                 -0.27      -0.24            -0.03         0.445       1,519       93         8

Math
  (1) Benchmark                               -0.17      -0.07            -0.10         0.108       1,629       95         8
  (2) No teacher covariates                   -0.17      -0.06            -0.11*        0.044       1,629       95         8
  (3) No teacher or student covariates        -0.18      -0.06            -0.12*        0.038       1,629       95         8
  (4) Schools weighted equally                -0.20      -0.07            -0.13*        0.000       1,629       95         8
  (5) Districts weighted equally              -0.18      -0.08            -0.10         0.092       1,629       95         8
  (6) Using specific information on           -0.17      -0.03            -0.14*        0.037       1,487       87         8
      teacher assignments
  (7) Without imposing data restrictions      -0.20      -0.08            -0.12*        0.042       1,700       97         8
  (8) No pretest, benchmark sample            -0.10      -0.13             0.03         0.749       1,629       95         8
  (9) No pretest, expanded sample             -0.09      -0.16             0.08         0.290       2,702      138         8
  (10) Compare teachers within districts,     -0.20      -0.12            -0.07         0.229       1,804      104         8
       not district-grades
  (11) Instrumental variables                 -0.15      -0.04            -0.11         0.091       1,554       93         8

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by 
        participating school districts; Mathematica Teacher Background Survey administered in fall 2005 to 
        all study teachers. 

Note:   Data pertain to teachers in one-year districts participating in the study. Data are regression 
        adjusted to account for clustering of students within schools. See Appendix Table A.1 for a list of 
        student and teacher covariates used in these models. Treatment and control group sample sizes are 
        shown in Appendix Table C.7. 

*Significantly different from zero at the 0.05 level. 

The fourth and fifth rows of Table V.4 show results from models that use the same sample as 
the benchmark model but apply a special set of weights to the students in the model. In the 
benchmark model, each student in the model receives an equal weight, implying that schools or 
districts with a greater number of students receive more weight in the overall results. To check how 
well the results generalize, for the fourth row, the student observations are weighted in such a way 
that each school in the model receives an equal weight. In the fifth row, the student observations are 
weighted in such a way that each district in the model receives an equal weight. Results from both of 
these specifications are consistent with the benchmark. 
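
The reweighting described above can be illustrated with a small sketch. It is not the study's estimation code: the function names (`equal_group_weights`, `weighted_mean`) and the five-student data set are hypothetical, and a plain weighted mean stands in for the regression model. Each student's weight is set to the inverse of the number of students in his or her group, so every school (or district) contributes the same total weight.

```python
from collections import Counter


def equal_group_weights(group_ids):
    """Return one weight per student so that every group has total weight 1."""
    counts = Counter(group_ids)
    return [1.0 / counts[g] for g in group_ids]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: school A has four students, school B has one.
schools = ["A", "A", "A", "A", "B"]
scores = [0.0, 0.0, 0.0, 0.0, 1.0]

# Benchmark-style weighting: every student counts equally.
student_weighted = weighted_mean(scores, [1.0] * len(scores))

# Alternative weighting: every school counts equally.
school_weighted = weighted_mean(scores, equal_group_weights(schools))
```

With equal student weights the large school dominates the average; with equal school weights the single student in school B carries as much weight as the four students in school A combined.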

We confirmed that there was no impact on reading when we modified the rules for including or 
excluding teachers from the sample. We had to make several decisions about which teachers were 
correctly linked to students and exclude those believed to be incorrectly linked. As explained in 
Appendix A, we attempted to follow up by telephone with teachers whom we suspected should 
have been excluded from the analysis because, according to the data we received from the school 
district, they taught too many or too few students to be plausibly eligible for the analysis as a regular 
classroom teacher. We contacted teachers who met certain criteria and asked them how many 
students they taught in reading and math in the 2006-2007 and 2007-2008 school years. Based on 
the information we received, we constructed a set of general eligibility rules. Because treatment 
teachers were more likely to confirm their teaching assignments than control teachers, we applied 
the four restriction rules to all teachers in the benchmark analysis, even if they were an exception to 
these general rules. In this way, the application of the restrictions was not confounded with 
treatment status. As a check on our results, however, we estimated treatment effects based on a 
sample that used general rules for teachers whom we were unable to contact and particular 
information from teachers' responses if it contradicted the general rules.1 The results are shown in 
the sixth row of Table V.4. As a further check, shown in the seventh row of Table V.4, we estimated 
the model based on a sample that did not impose any exclusion rules. The estimated impact of 
treatment in both of these models remained statistically insignificant. 

Another set of models changes the statistical model, thereby expanding the sample. We excluded 
from the benchmark analysis any student with missing pretest scores. Because of this, we may have 
excluded from the analysis mobile students and students who were in the lowest grade tested in the 
district, often third grade. The eighth row shows results from estimating impacts without controlling 
for a pretest but maintaining the same sample as the benchmark model. The ninth row shows results 
that do not control for pretest and expand the sample by including all students with a valid posttest 
score, including those who lack a pretest score. In both cases, the impact is not significantly different 
from zero. 

The tenth row shows results from a model that uses district fixed effects (rather than district- 
grade fixed effects). Because comparisons between treatment and control teachers are made at the 
district level rather than the district-grade level, this model does not require that there be 
treatment/control overlap in each district-grade combination for teachers to be included. This is 
because we are now estimating each teacher's effectiveness by implicitly comparing each teacher to 
all teachers in her or his state and then comparing the effectiveness of treatment teachers to control 
teachers for all grades within a district. With this estimation strategy, we can include a larger sample 
of teachers. This model also shows no significant treatment effect. 


1 For reading for one-year districts, the additional information resulted in our excluding 10 teachers who were 
included when applying the general rules. For math for one-year districts, we excluded 8 teachers who had been 
included. For reading for two-year districts, the additional information did not change the sample. For math for two-year 
districts, we excluded fewer than three teachers. 


The eleventh row shows results from a regression model in which the math pretest is used as an 
instrumental variable to adjust for measurement error in the reading pretest. This also decreases the 
sample since students who lack a math pretest are excluded. In nonexperimental settings, if the 
students of teachers in the treatment and control groups are different in ways not easily observable 
to the researcher, this estimation strategy can correct bias in the estimates. Although we have 
conducted an experiment, these results are included to account for the possibility that principals may 
have assigned students to treatment and control teachers differently in the third year of the study 
than they did in the first year. For example, if principals believed that comprehensive teacher 
induction gave teachers a better ability to cope with disruptive students, they may have been more 
willing than usual to place potentially disruptive students in those teachers’ classrooms in subsequent 
years. Using the instrumental variables model, however, did not change our findings. 
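
The role of the opposite-subject pretest as an instrument can be sketched with simulated data. This is an illustration only, not the study's model: the data are generated at random, all variable names are hypothetical, and the sketch assumes that the measurement errors in the two pretests are independent of each other and of the posttest error. It uses the simple one-regressor instrumental variables estimator (a ratio of covariances) rather than the full benchmark specification with covariates and fixed effects.

```python
import random


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000

# Hypothetical latent achievement and two noisy pretests that measure it.
ability = [rng.gauss(0, 1) for _ in range(n)]
reading_pretest = [a + rng.gauss(0, 1) for a in ability]
math_pretest = [a + rng.gauss(0, 1) for a in ability]

# Posttest depends on latent achievement with a true coefficient of 1.
posttest = [a + rng.gauss(0, 0.5) for a in ability]

# Ordinary least squares slope on the noisy reading pretest is attenuated
# toward zero (about 0.5 here, because half the pretest variance is noise).
ols_slope = cov(posttest, reading_pretest) / cov(reading_pretest, reading_pretest)

# Instrumenting the reading pretest with the math pretest recovers a slope
# close to the true value of 1 under the independence assumption above.
iv_slope = cov(posttest, math_pretest) / cov(reading_pretest, math_pretest)
```

In this simulation the pretest coefficient is biased toward zero under ordinary least squares and approximately unbiased under the instrumental variables estimator, which is the sense in which the instrument "removes the influence of measurement error."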

Possible Negative Impact on Math in One-Year Districts. While the estimated math 
impact in the benchmark model was not statistically significant, the detailed analysis showed 
conditions under which the negative impact estimate was statistically significant. For the math 
results, the bottom panel of Table V.3 shows that grade-by-grade impacts are negative and 
statistically significant for grade 3 and not significantly different from zero for other grades. 


Using all grades for math tests, the bottom panel of Table V.4 shows findings of no impact 
when we weight districts equally (line 5), exclude the pretest from the model (lines 8-9), use a model 
based on district fixed effects (line 10), or use an instrumental variables approach (line 11). Results 
are negative and significant when we include only student covariates (line 2), include only a single 
pretest measure plus district-grade fixed effects as covariates (line 3), weight schools equally (line 4), 
use specific information from follow-up phone calls to teachers (line 6), or do not impose data 
restrictions on the sample (line 7). 


2. Two-Year Districts: Positive Impact on Math and Reading for Benchmark Sample 


For two-year districts, the benchmark estimates of the impacts on reading and math scores were 
positive and significant for the third year, one year following the end of the intervention in these 
districts (see Table V.5). The estimates suggest that assignment to two years of comprehensive 
teacher induction instead of a district's usual induction services increases student reading scores by 
11 percent of a standard deviation and increases math scores by 20 percent of a standard deviation. 
These impacts are the equivalent of moving the average student from the 50th percentile up 
4 percentile points in reading and 8 percentile points in math. As we show in the sensitivity analyses, 
however, if we reestimate the impacts without requiring test scores from the prior year, we do not 
find an impact on math or reading scores. This alternative approach nearly doubles the available 
sample of study teachers, but the lack of data on students’ prior achievement results in a less precise 
estimate. This means that we are less likely to detect a true impact if it exists, despite the larger 
sample size. 

Impact Estimates for Reading in Two-Year Districts Not Robust to All Specifications 
and Samples. The sensitivity analysis for the reading test score impacts in two-year districts showed 
that the statistical significance of the result required pooling all grades and including covariates. 
Underlying the positive and significant overall impact on reading was a positive and significant effect 
in grade 3 and impact estimates for grades 2, 4, 5, and 6 that were not statistically significant, as 
shown in the top panel of Table V.6. 
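
The percentile equivalents quoted for the two-year benchmark estimates (0.11 and 0.20 standard deviations) can be reproduced with a short calculation. The sketch below assumes that test scores are normally distributed, which is an assumption made here for illustration; the function names are hypothetical. A student at the 50th percentile who gains d standard deviations moves to the percentile given by the standard normal cumulative distribution function evaluated at d.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_gain(effect_size):
    """Percentile points gained by a median student, assuming normal scores."""
    return 100.0 * normal_cdf(effect_size) - 50.0


reading_gain = percentile_gain(0.11)  # roughly 4 percentile points
math_gain = percentile_gain(0.20)     # roughly 8 percentile points
```

Rounded to whole points, these values match the 4 and 8 percentile points reported in the text.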


The top panel of Table V.7 shows mixed results when we apply alternate model specifications. 
The impact estimates remain positive and significant when we weight districts equally (line 5), 
incorporate specific callback information (line 6), do not impose data restrictions on the sample (line 
7), use a model with district fixed effects (line 10), and use an instrumental variables approach to 
remove the influence of measurement error (line 11).41 On the other hand, we find results that are 
not statistically different from zero when we change the covariates used in the model, either by 
omitting teacher and/or student background characteristics (lines 2 and 3) or excluding the pretest 
(lines 8-9). The results are also statistically insignificant when we change the method used to estimate 
the standard errors of the model.42 When we weight schools equally (line 4), the p-value rises above 
the 0.05 significance threshold to 0.052. 


Table V.5. Impacts on Test Scores: Two-Year Districts, 2007-2008 School Year 

                Adjusted Mean Test Scores   Difference                        Sample Sizes
Subject         Treatment   Control        (Effect Size)   P-value   Students   Teachers   Districts

Reading           -0.14      -0.25             0.11*        0.018       1,347       74         6
Math              -0.06      -0.26             0.20*        0.000       1,198       68         6

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided 
        by participating school districts; Mathematica Teacher Background Survey administered in fall 
        2005 to all study teachers. 

Note:   Data pertain to teachers in two-year districts participating in the study. Data are regression 
        adjusted to account for pretest, student and teacher characteristics, district-by-grade fixed 
        effects, and clustering of students within schools. Treatment and control group sample sizes 
        are shown in Appendix Table C.8. 

*Significantly different from zero at the 0.05 level. 


Table V.6. Impacts on Test Scores by Grade: Two-Year Districts, 2007-2008 School Year 

                Adjusted Mean Test Scores   Difference                        Sample Sizes
Subject/Grade   Treatment   Control        (Effect Size)   P-value   Students   Teachers   Districts

Reading
  Grade 2         -0.26      -0.29             0.03         0.723          63        5         1
  Grade 3         -0.21      -0.50             0.29*        0.000         261       18         2
  Grade 4         -0.03      -0.07             0.04         0.595         741       39         5
  Grade 5         -0.24      -0.44             0.19         0.098         246       12         2
  Grade 6         -1.20      -1.02            -0.18         0.339          36        3         1
  All Grades      -0.14      -0.25             0.11*        0.018       1,347       74         6

Math
  Grade 2         -0.23      -0.37             0.14         0.092          63        5         1
  Grade 3         -0.14      -0.41             0.27*        0.000         279       19         2
  Grade 4          0.12      -0.05             0.17*        0.033         614       35         5
  Grade 5         -0.33      -0.46             0.13         0.090         206       11         2
  Grade 6         -0.74      -1.25             0.51         0.078          36        3         1
  All Grades      -0.06      -0.26             0.20*        0.000       1,198       68         6

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided 
        by participating school districts; Mathematica Teacher Background Survey administered in fall 
        2005 to all study teachers. 

Note:   Data pertain to teachers in two-year districts participating in the study. Data are regression 
        adjusted to account for pretest, student and teacher characteristics, district-by-grade fixed 
        effects, and clustering of students within schools. Treatment and control group sample sizes 
        are shown in Appendix Table C.9. 

*Significantly different from zero at the 0.05 level. 

Impact Estimate for Math in Two-Year Districts Not Robust to All Specifications and 
Samples. The conclusion that treatment had a statistically significant positive impact on math did 
not change when we conducted sensitivity analysis except in some models where student covariates 
such as the pretest were omitted. Underlying the benchmark estimate in two-year districts are positive and 
significant impacts for grades 3 and 4. Results for grades 2, 5, and 6 are statistically insignificant, but 
the estimates for grades 2 and 6 were each derived using data from one district. See the bottom 
panel of Table V.6. 

The bottom panel of Table V.7 shows that the initial finding of a positive and significant impact 
is robust to 7 of the 10 changes in covariates or specification that we present. The alternate 
specifications show a positive and significant effect of treatment except when we include only a 
single pretest measure plus district-grade fixed effects as covariates (line 3) or exclude the pretest 
from the model (lines 8-9). 

3. Understanding the Year 3 Findings on Test Score Impacts in the Two-Year Districts 

Positive and significant results emerge in two-year districts only in the third year of the study 
and depend on the definition of the analysis sample. In order to better understand this pattern of 
findings, we conducted two additional exploratory analyses: (1) we examined test score impacts for 
teachers who remained in the analysis sample in more than one study year; (2) we explored whether 
there are positive impacts on intermediary variables of interest for the benchmark sample of 
74 teachers for reading and 68 teachers for math. 


41 For reading estimates for two-year districts, the sample did not change when we incorporated specific callback 
information (line 6), as none of the teachers contacted during the follow-up phase indicated a different teaching 
assignment from what we had assumed. 

42 As explained in Appendix A, there are multiple methods with which to estimate the standard errors of 
hierarchical models. The benchmark student achievement models use ordinary least squares with Huber-White robust 
standard errors. We use this method because we collect data from 15 school districts, and ordinary least squares does not 
require as many assumptions about the distribution of the error terms as a random effects model. Under the alternative 
methods discussed in Appendix A (generalized least squares estimates of a random effects model, maximum likelihood, 
and restricted maximum likelihood), the treatment effect of the reading estimate for two-year districts is not statistically 
significant. 

Table V.7. Impacts on Test Scores, Alternate Model Specifications: Two-Year Districts, 2007-2008 
School Year 

                                            Adjusted Mean Test Scores   Difference                        Sample Sizes
Subject/Model                               Treatment   Control        (Effect Size)   P-value   Students   Teachers   Districts

Reading
  (1) Benchmark                               -0.14      -0.25             0.11*        0.018       1,347       74         6
  (2) No teacher covariates                   -0.17      -0.20             0.04         0.562       1,347       74         6
  (3) No teacher or student covariates        -0.18      -0.19             0.01         0.853       1,347       74         6
  (4) Schools weighted equally                -0.14      -0.24             0.09         0.052       1,347       74         6
  (5) Districts weighted equally              -0.15      -0.27             0.12*        0.007       1,347       74         6
  (6) Using specific information on           -0.14      -0.25             0.11*        0.018       1,347       74         6
      teacher assignments
  (7) Without imposing data restrictions      -0.14      -0.25             0.11*        0.018       1,347       74         6
  (8) No pretest, benchmark sample            -0.16      -0.21             0.05         0.496       1,347       74         6
  (9) No pretest, expanded sample             -0.26      -0.18            -0.07         0.392       2,457      127         7
  (10) Compare teachers within districts,     -0.12      -0.27             0.16*        0.001       1,484       82         6
       not district-grades
  (11) Instrumental variables                 -0.11      -0.27             0.16*        0.000       1,338       73         6

Math
  (1) Benchmark                               -0.06      -0.26             0.20*        0.000       1,198       68         6
  (2) No teacher covariates                   -0.08      -0.23             0.14*        0.019       1,198       68         6
  (3) No teacher or student covariates        -0.09      -0.21             0.12         0.077       1,198       68         6
  (4) Schools weighted equally                -0.07      -0.26             0.20*        0.000       1,198       68         6
  (5) Districts weighted equally              -0.06      -0.26             0.19*        0.000       1,198       68         6
  (6) Using specific information on           -0.03      -0.25             0.22*        0.000       1,161       66         6
      teacher assignments
  (7) Without imposing data restrictions      -0.01      -0.23             0.23*        0.000       1,259       70         6
  (8) No pretest, benchmark sample            -0.08      -0.23             0.15         0.054       1,198       68         6
  (9) No pretest, expanded sample             -0.19      -0.16            -0.03         0.739       2,254      120         7
  (10) Compare teachers within districts,     -0.04      -0.17             0.13*        0.046       1,398       77         6
       not district-grades
  (11) Instrumental variables                 -0.04      -0.28             0.24*        0.000       1,193       68         6

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by 
        participating school districts; Mathematica Teacher Background Survey administered in fall 2005 to 
        all study teachers. 

Note:   Data pertain to teachers in two-year districts participating in the study. Data are regression adjusted 
        to account for clustering of students within schools. See Appendix Table A.1 for a list of student 
        and teacher covariates used in these models. Treatment and control group sample sizes are shown 
        in Appendix Table C.10. 

*Significantly different from zero at the 0.05 level. 

The impetus for the first analysis is to determine whether the progression to positive and 
statistically significant impacts in year 3 represents teacher improvement or a shift in the 
composition of the group of teachers in the study sample. Though the study follows a group of 
teachers over three years, the particular group of teachers included in the test score sample varies 
from year to year. This was due to teachers changing teaching assignment, leaving the district, 
leaving teaching, or teaching in a grade where their counterparts in the treatment or control group 
left the sample, because each teacher needed a comparison teacher in the same grade and district. In 
this analysis, we restrict the data to a "common sample" of teachers who had students with valid test 
score data in either Year 1 and Year 3 or Year 2 and Year 3. The resulting sample sizes for these 
analyses are small, ranging from 37 to 52 teachers, so true changes in relative teacher performance 
may be difficult to detect. 

Results, shown in Appendix C, Table C.11, include some evidence of treatment teachers 
improving relative to controls. Within the common samples, in reading there was a positive and 
significant improvement for treatment teachers relative to control teachers for the sample of 
teachers common to year 1 and year 3 (p-value = 0.00). In math, there was a positive but not 
statistically significant improvement (p-value = 0.14). Common sample results for year 2 and year 3, 
however, show no gain in reading and a small positive but statistically insignificant gain in math 
(p-value = 0.38). 

The second exploratory analysis asks whether there were positive impacts on measures of 
induction activities and classroom practices for the samples of teachers who were eligible for the test 
score analysis, consistent with the conceptual framework of teacher induction (Figure I.1). This 
sample was referred to above as the "benchmark." Results are shown in Appendix C, Tables C.12 
and C.13, and summarized as follows: 

• Treatment teachers in this sample were more likely than controls to report having a 
mentor assigned to them in each of the four surveys during the first two years, while the 
treatment was in place. The differences, all statistically significant, were more than 
20 percentage points in the first year and more than 50 points in the second year of 
implementation. 

• There were no statistically significant differences for this sample in the overall amount of 
time spent with mentors, but treatment teachers were more likely by statistically 
significant margins of 23 percentage points or more to report receiving suggestions on 
improving instructional practices than their control group counterparts. Also, higher 
percentages of treatment than control teachers in this sample reported having received 
guidance with their particular subject area (math or reading), although the differences 
were not statistically significant for two of the three surveys for reading (p-values = 
0.343, 0.000, 0.325 for fall 2005, spring 2006, and spring 2007, respectively) and were not 
significant for math during any of the surveys (p-values = 0.854, 0.073, and 0.126). When 
teachers reported the number of times they received feedback on their teaching outside 
of a formal evaluation, the treatment group reported greater numbers on average in at 
least one of the four surveys during the first two years for math and reading. The other 
differences were not statistically significant. Thus, while treatment teachers in the test 
score samples did not spend more time with mentors than controls, there is some 
suggestive evidence that they spent that time on the types of activities that could lead to 
impacts on student achievement. 

• For this sample’s performance on the classroom practice outcomes that we observed 
directly in spring 2006 (teachers' first year in the classroom), treatment teachers scored 
higher than control teachers on the measure of classroom culture, although the 
difference (effect size = 0.2) was not statistically significant (p = 0.199). Differences for 
the other two teacher practice measures, one positive and one negative, were smaller in 
absolute value and had even higher p-values and thus were not statistically significant 
either. However, the sample of 56 teachers in two-year districts that had both valid test 
score data in year 3 and classroom observation data from year 1 (because they had been 
teaching English/language arts) was not large enough to be able to detect impacts (i.e., to 
attain statistical significance) unless the impacts had been very large. 

Overall, these findings provide some evidence that is consistent with the theory that 
comprehensive induction improved student outcomes. However, the small sample sizes and lack of 
statistical significance mean that these findings are suggestive only and should be interpreted with 
caution. 

VI. IMPACT FINDINGS: WORKFORCE OUTCOMES 

The main goal of this study is to estimate the impact of comprehensive teacher induction on 
teacher and student outcomes. This chapter focuses on teacher outcomes, particularly teachers’ 
attachment to the workforce. The goal is to determine whether the intervention makes teachers 
more likely to continue teaching in their original district or anywhere else. As a step along the way 
toward retention, we measure teachers' attitudes that relate to career decisions, including their 
satisfaction with teaching and their feelings of preparedness to deal with different aspects of their 
jobs. The ultimate policy goal is not 100 percent teacher retention, but retention of better-qualified 
and higher-quality teachers, so we also present evidence on how comprehensive teacher induction 
affects the mix of teachers who decide to stay in the district and in the profession. 

A. Impact Findings: Teacher Attitudes 

The impact of teacher induction on teacher attitudes is an important signal of whether the 
program is generating its intended effect, an intermediate step on the way to encouraging retention. 
The induction activities surveys indicated that comprehensive induction did not make teachers feel 
more satisfied with or more prepared to do their jobs. There were no statistically significant positive 
impacts of treatment on teacher satisfaction or teacher preparedness at any of the time points at which 
we collected data: fall 2005, spring 2006, fall 2006, spring 2007, fall 2007, or fall 2008 for either one-year 
or two-year districts. 

Using items from the induction activities surveys, we measured teachers’ feelings of satisfaction 
in 19 areas and teachers’ feelings of preparedness in 13 areas. Factor analysis suggested that teacher 
satisfaction and teacher preparedness could be grouped into three categories each: satisfaction with 
(1) school, (2) class, and (3) career; and preparedness to (1) instruct, (2) work with students, and 
(3) work with others (details are given in Appendix A). The constructed scales for each of these 
categories exhibited internal consistency ranging from 0.72 to 0.98, as tested by the Cronbach’s 
alpha coefficient. Psychometric properties for each scale are given in Appendix A, Table A.5. 
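
As a reminder of what the internal consistency statistic measures, the sketch below computes Cronbach's alpha from its standard formula: alpha = k/(k - 1) x (1 - sum of item variances / variance of the total score), where k is the number of items in a scale. The response data and the function name `cronbach_alpha` are hypothetical and are not drawn from the study's surveys.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k lists; each inner list holds one item's scores
    for the same respondents, in the same order.
    """
    k = len(items)
    # Total scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical two-item scale answered by four respondents.
responses = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
]
alpha = cronbach_alpha(responses)

# Items that move together perfectly give the maximum value of 1.
perfect = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

Values closer to 1 indicate that the items in a scale vary together across respondents; the hypothetical two-item example yields 0.75.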

Benchmark estimates for teacher satisfaction and teacher preparedness are based on a 
hierarchical linear model. As shown in Appendix A, Table A.1, the model has district and grade 
fixed effects and no other covariates. The three satisfaction scales and three preparedness scales 
were entered into separate regression models with the same set of control variables. The results did 
not vary according to estimation method or to the set of control variables we used. 

1. No Impact on Teacher Satisfaction 

Overall, teachers from the treatment and control groups reported feelings of satisfaction that 
differed by 0.1 or less on a four-point scale in fall 2005, spring 2006, fall 2006, spring 2007, fall 2007, 
and fall 2008.43 Out of the 15 differences examined among teachers in one-year districts (three 
measures at five points in time), and the 18 differences examined among teachers in two-year 
districts (three measures at six points in time), none were statistically significant. Figures VI.1 and 
VI.2 show treatment group means represented by a solid line and control group means represented 
by a dotted line. See Tables D.1 and D.2 in Appendix D for detailed information on point estimates, 
differences, p-values, and sample sizes. 

43 Teacher satisfaction was not measured in one-year districts in spring 2007. 


As a sensitivity test, we recoded the teacher satisfaction data from each time point into two
categories and examined individual survey items separately, finding that the results were consistent
with those based on collapsed scales. Out of the 95 differences examined among teachers in
one-year districts (19 measures at five points in time), one difference was statistically significant.
Treatment teachers were significantly less likely than control teachers to report satisfaction with
salary and benefits in fall 2007. Out of the 114 differences examined among teachers in two-year
districts (19 measures at six points in time), three differences were statistically significant. The results
showed that treatment teachers were significantly more likely than control teachers to report
satisfaction with opportunities for professional development in fall 2006 and spring 2007 and were
more likely than control teachers to report satisfaction with school facilities (buildings and grounds)
in fall 2007. Detailed information from fall 2007 and fall 2008 is available for one-year districts in
Table D.3 and for two-year districts in Table D.4 in Appendix D.
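
To put the handful of significant item-level differences in context, it helps to ask how many would be expected by chance alone when this many tests are run at the 0.05 level. The sketch below is a rough gauge only, not the study's method (Chapter II describes that): the function names are hypothetical, and the second function assumes independent tests, which survey items answered by the same teachers are not.

```python
def expected_false_positives(num_tests, alpha=0.05):
    """Expected count of significant results if every null hypothesis is true."""
    return num_tests * alpha


def prob_at_least_one(num_tests, alpha=0.05):
    """Chance of one or more false positives, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** num_tests


# Test counts taken from the text: 95 (one-year) and 114 (two-year) differences.
for label, n in [("one-year districts", 95), ("two-year districts", 114)]:
    print(label, round(expected_false_positives(n), 2), round(prob_at_least_one(n), 3))
```

Under these simplifying assumptions, roughly five or six significant differences would be expected by chance in each set, more than the one and three actually observed.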


2. No Impact on Teachers’ Feeling of Preparedness 


Overall, teachers from the treatment and control groups reported feelings of preparedness that
differed by 0.1 or less on a four-point scale in fall 2005, spring 2006, spring 2007, and fall 2008 in
both one-year and two-year districts. Of the 9 differences examined among teachers in one-year
districts (three measures at three points in time), none were statistically significant (Figure VI.3). Of
the 12 differences examined among teachers in two-year districts (three measures at four points in
time), one was statistically significant: treatment teachers were significantly less likely than control
teachers to report being prepared to instruct in fall 2005 (Figure VI.4). See Appendix D for detailed
information for one-year districts (Table D.5) and two-year districts (Table D.6).


Footnote: See Chapter II for a discussion of multiple comparisons and false discoveries.

Footnote: The item-specific impacts for fall 2005, spring 2006, fall 2006, and spring 2007 can be found in
Isenberg et al. (2009).

Footnote: Teacher preparedness was not measured in one-year districts in fall 2006, spring 2007, or fall
2007. Teacher preparedness was not measured in two-year districts in fall 2006 or fall 2007.
Figure VI.1. Impacts on Teacher Satisfaction (Scores on a Four-Point Scale): One-Year Districts

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall
2005, spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are
weighted and regression adjusted to account for differences in districts, teacher grade
assignments, study design, and the clustering of teachers within schools. Satisfaction
scale: (1) very dissatisfied, (2) somewhat dissatisfied, (3) somewhat satisfied, or (4) very
satisfied. Sample sizes vary due to item nonresponse.

Treatment-control differences are not significantly different from zero at the 0.05 level (N = 498
teachers in fall 2005, 492 teachers in spring 2006, 472 teachers in fall 2006, 424 teachers in fall
2007, and 396 teachers in fall 2008).
Figure VI.2. Impacts on Teacher Satisfaction (Scores on a Four-Point Scale): Two-Year Districts

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall
2005, spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers and Fourth Induction
Activities Survey administered in spring 2007 to study teachers in two-year districts.

Note: Data pertain to teachers in all two-year districts participating in the study. Data are weighted and
regression adjusted to account for differences in districts, teacher grade assignments, study design,
and the clustering of teachers within schools. Satisfaction scale: (1) very dissatisfied, (2) somewhat
dissatisfied, (3) somewhat satisfied, or (4) very satisfied. Sample sizes vary due to item
nonresponse.

Treatment-control differences are not significantly different from zero at the 0.05 level (N = 391 teachers in fall
2005, 384 teachers in spring 2006, 359 teachers in fall 2006, 370 teachers in spring 2007, 321 teachers in fall
2007, and 318 teachers in fall 2008).
Figure VI.3. Impacts on Teacher Preparedness (Scores on a Four-Point Scale): One-Year Districts

Source: Mathematica First, Second, and Sixth Induction Activities Surveys administered in fall
2005, spring 2006, and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are
weighted and regression adjusted to account for differences in districts, teacher grade
assignments, study design, and the clustering of teachers within schools. Preparedness
scale: (1) not at all prepared, (2) somewhat prepared, (3) well prepared, or (4) very well
prepared. Sample sizes vary due to item nonresponse.

Treatment-control differences are not significantly different from zero at the 0.05 level (N = 501
teachers in fall 2005, 493 teachers in spring 2006, and 386 teachers in fall 2008).
Figure VI.4. Impacts on Teacher Preparedness (Scores on a Four-Point Scale): Two-Year Districts

[Panel: Preparedness to Work with Others]

Source: Mathematica First, Second, and Sixth Induction Activities Surveys administered in fall
2005, spring 2006, and fall 2008 to all study teachers and Fourth Induction Activities
Survey administered in spring 2007 to study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Data are
weighted and regression adjusted to account for differences in districts, teacher grade
assignments, study design, and the clustering of teachers within schools. Preparedness
scale: (1) not at all prepared, (2) somewhat prepared, (3) well prepared, or (4) very well
prepared. Sample sizes vary due to item nonresponse.

Treatment-control differences are not significantly different from zero at the 0.05 level except for
preparedness to instruct in fall 2005 (N = 394 teachers in fall 2005, 381 teachers in spring 2006, 371
teachers in spring 2007, and 308 teachers in fall 2008).
When we recoded the teacher preparedness indicators at each time point into two categories
and examined individual survey items separately, the findings were consistent with the approach
taken above. Of the 39 differences examined among teachers from one-year districts (13 measures at
three points in time), none were statistically significant. Of the 52 differences examined among
teachers from two-year districts (13 measures at four points in time), six differences were statistically
significant. Treatment teachers were significantly less likely than control teachers to report being
prepared to assess students, select/adapt curriculum and materials, plan effective lessons, and be an
effective teacher in fall 2005; less likely than control teachers to report feeling prepared to work
effectively with parents in spring 2006; and less likely than control teachers to report feeling
prepared to work with other teachers to plan instruction in fall 2008. See Appendix D for detailed
information from fall 2008 for one-year districts (Table D.7) and two-year districts (Table D.8).

B. Impact Findings: Teacher Retention 

1. No Impact on Retention Rates 

Neither exposure to one year nor exposure to two years of comprehensive induction had a 
significant impact on teacher retention over the first four years of the teachers’ careers. We surveyed 
teachers annually to learn whether they had remained in their original school, remained in their 
original school district, or remained in the teaching profession. We use the terms school stayer and 
district stayer to refer to teachers who remained in their original school or district, respectively, after 
three years. (All school stayers are, by definition, district stayers as well.) We use the terms mover and
leaver, respectively, to refer to teachers who, after three years, left the district but remained in
teaching or were no longer teaching.

Figure VI.5 illustrates the lack of significant treatment-control differences. It shows a set of
survival curves that plot the percentage of teachers retained in the district (Panel A) and in the
profession (Panel B) in each year of the study for the one-year districts. By the end of the study
period, 69 percent of teachers in one-year districts remained in their original district and 87 percent 
were still teaching. Differences between the treatment group, represented by a solid line, and the 
control group, represented by a dashed line, were statistically insignificant at each time point for 
both types of retention. 

Figure VI.6 shows the same survival curves for two-year districts. The retention rates by the
end of the study period were 63 percent in the district and 85 percent in teaching for two-year
districts, but the treatment-control differences were not statistically significant.

When we conducted a similar analysis of retention in the same school, we found a similar 
pattern for both one- and two-year districts; those results are shown in Appendix D, Tables D.9 and
D.10, which include more detail, including sample sizes, for all the final retention rates. Detailed data
on retention rates from the first two years of the study period can be found in earlier reports
(Glazerman et al. 2008; Isenberg et al. 2009).


Footnote: See Chapter II for a discussion of multiple comparisons and false discoveries.

Footnote: The item-specific impacts for fall 2005 and spring 2006 can be found in Glazerman et al. (2008). The
item-specific impacts for spring 2006 and spring 2007 can be found in Isenberg et al. (2009).
Figure VI.5. Survival Curves for One-Year Districts
Panel A. Percentage Remaining in the District

Source: Mathematica First, Second, and Third Teacher Mobility Surveys administered in fall 2006, fall 2007, and
fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study.

Treatment-control differences are not significantly different from zero at the 0.05 level (N = 561
teachers in fall 2005, 500 teachers in fall 2006, 476 teachers in fall 2007, and 417 in fall 2008 for
Panel A and 464 teachers in fall 2008 for Panel B).





Figure VI.6. Survival Curves for Two-Year Districts
Panel A. Percentage Remaining in the District

Source: Mathematica First, Second, and Third Teacher Mobility Surveys administered in fall 2006, fall 2007, and
fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study.

Treatment-control differences are not significantly different from zero at the 0.05 level (N = 448
teachers in fall 2005, 382 teachers in fall 2006, 364 teachers in fall 2007, and 345 teachers in fall
2008 for Panel A and 375 teachers in fall 2008 for Panel B).
2. No Impact on Mobility Patterns

When we examined in more detail where the movers and leavers went, we still did not find
significant differences between treatment and control group mobility patterns. Figures VI.7 and VI.8
show the mobility outcomes in detail by treatment status for one-year and two-year districts,
respectively. The hypothesis that the distributions were independent of treatment status could not
be rejected (p-value = 0.929 for one-year districts and 0.447 for two-year districts), suggesting that
the treatment-control differences were not statistically significant.

Statistically similar percentages of treatment and control teachers (about 47 percent) stayed in 
the same school in one-year districts, whereas 15 percent moved within the district, 10 percent 
moved to a new district (including charter schools), and 5 percent moved to private schools. Both
groups included sample members we call unclassified nonleavers who were teaching but provided
insufficient information for us to determine where they were (about 11 total). Similar percentages of
treatment and control teachers left teaching to take another job or attend school or to do something
else, such as work in the home. Results for two-year districts are shown in Figure VI.8.

Figure VI.7. Detailed Mobility Status: Teachers in One-Year Districts

Source: Mathematica Third Teacher Mobility Survey administered in fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Sample sizes are 239 treatment
teachers and 230 control teachers.

Treatment-control differences are not significantly different from zero at the 0.05 level.






Figure VI.8. Detailed Mobility Status: Teachers in Two-Year Districts

Source: Mathematica Third Teacher Mobility Survey administered in fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Sample sizes are 212 treatment
teachers and 171 control teachers.

Treatment-control differences are not significantly different from zero at the 0.05 level.


We also anticipated the possibility that leavers might return to teaching. Therefore, the mobility
survey asked respondents when, if ever, they expected to return and, if so, with how much certainty.
We did not find significant differences between the treatment and control leavers in their
expectations about returning to teaching. The average probability that teachers in one-year districts
reported for expecting to return to the profession was 60 percent for control leavers and 56 percent
for treatment leavers, a difference that was not statistically significant (p-value = 0.755). The
corresponding numbers for two-year districts were 53 percent of control leavers and 64 percent of
treatment leavers, a difference that was also not significant (p-value = 0.486).

Footnote: The one-year district analysis is based on 21 control and 15 treatment leavers.

Footnote: The two-year district analysis is based on 11 control and 16 treatment leavers.

Next, we examined the reasons teachers gave for moving to a teaching position outside their
original school or for leaving the profession and again found no significant differences. Figure VI.9
lists the reasons that teachers gave for moving, indicating which reasons were cited as the most
important. Figure VI.10 presents the reasons cited by those who left teaching. In both cases, the
reasons were not significantly different between the treatment and control groups. Figures VI.9 and
VI.10 show the data combined for one-year and two-year districts. The findings within district type
(not shown) tell a similar story.

Figure VI.9. Self-Reported Reasons for Changing Schools

Source: Mathematica Third Teacher Mobility Survey administered in fall 2008 to all study teachers.

Note: Sample sizes are 113 treatment teachers and 114 control teachers.

Treatment-control differences are not significantly different from zero at the 0.05 level.

Figure VI.10. Self-Reported Reasons for Leaving Teaching

Source: Mathematica Third Teacher Mobility Survey administered in fall 2008 to all study teachers.

Note: Sample sizes are 48 treatment teachers and 49 control teachers.

Treatment-control differences are not significantly different from zero at the 0.05 level.

To go beyond self-reported reasons, we examined teacher mobility outcomes directly to
identify the factors statistically associated with retention. Assignment to treatment (comprehensive
teacher induction) did not explain teachers’ decisions to leave the district or to leave teaching. A 
fuller discussion of these findings is presented in Chapter VII. 

We conducted sensitivity analyses to confirm that the findings of no impacts on retention rates 
were robust. One concern with data that rely on longitudinal surveys is the possibility of systematic 
bias introduced by nonresponse. The response rate to the final teacher mobility survey was 
85 percent overall, which exceeded the study’s original goal of 80 percent, but differed by treatment 
status (90 percent for treatment teachers and 80 percent for control teachers). If nonrespondents are
more likely than respondents to be leavers, then one might expect this differential nonresponse rate
to translate into a negative bias in the estimated impact of treatment on retention. The size of this
bias can be bounded by assuming that all nonrespondents are leavers or all are stayers. The findings,
presented in detail in Appendix D, suggest that we would find a significant impact of treatment only 
under the most extreme assumptions, such as that 100 percent of nonrespondents to the survey 
were leavers or movers. 
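
The bounding logic described above can be sketched in a few lines. This is an illustrative simplification, not the analysis reported in Appendix D: the function names are hypothetical, the 90 and 80 percent response rates come from the text, and the 70 percent retention rate among respondents is an assumed placeholder value rather than a study result.

```python
def retention_bounds(response_rate, retained_among_respondents):
    """Return (lower, upper) bounds on the full-sample retention rate.

    lower: every nonrespondent is treated as a leaver.
    upper: every nonrespondent is treated as a stayer.
    """
    observed = response_rate * retained_among_respondents
    return observed, observed + (1 - response_rate)


def impact_bounds(treatment, control):
    """Bound the treatment-control retention difference across all scenarios."""
    t_low, t_high = retention_bounds(*treatment)
    c_low, c_high = retention_bounds(*control)
    return t_low - c_high, t_high - c_low


# Response rates from the text; the 0.70 retention rate is a hypothetical placeholder.
treatment = (0.90, 0.70)
control = (0.80, 0.70)

low, high = impact_bounds(treatment, control)
# Scenario in which all nonrespondents in both groups are leavers.
all_leavers = retention_bounds(*treatment)[0] - retention_bounds(*control)[0]
print(round(low, 2), round(high, 2), round(all_leavers, 2))
```

With equal retention among respondents, treating every nonrespondent as a leaver produces a positive treatment-control gap, which illustrates why only extreme assumptions about nonrespondents would change the finding.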

We also dealt with missing data by using information collected by data collectors in the field on 
the status of respondents. For example, we could code the mobility status of a survey respondent 
who refused to complete the survey based on where we found that individual (or failed to find the 
individual) when attempting to complete an in-person interview. When we used these survey field 
codes to augment the mobility variables from self-reports, we continued to find no impact on 
teacher mobility. Similarly, we reached the same conclusion when we attempted to augment the
reported measures for item nonrespondents, people who gave incomplete responses to individual 
survey items, by using other information on respondents' records. For example, if a survey 
respondent failed to indicate whether he or she was currently teaching and did not specify the 
person’s current school, but the respondent followed a survey skip instruction meant for current 
teachers or responded to an item about current teacher satisfaction, then the augmented variable 
included the teacher as a stayer. 

Other sensitivity analyses focused on the statistical methods. We reestimated the impacts using
a set of logistic regressions to control for teacher background characteristics and using a multinomial 
logit regression, which models mobility as a categorical outcome. In each case, the main findings 
were confirmed. 

C. Composition of the Workforce 

We investigated not only the impacts of treatment on retention, but also the impacts on the 
composition of the teaching force in the district. Although staff turnover can be disruptive and costly 
(Alliance for Excellent Education 2004; National Council on Teaching and America’s Future 2003),
some turnover is inevitable in teaching, as it is in most professions. A critical question is whether
turnover raises quality by encouraging the weakest teachers to leave or lowers it by discouraging the
strongest ones from staying (Goldhaber et al. 2007).

To test the impact of treatment on the mix of teachers, we used measures of teachers’ 
professional qualifications, classroom practice ratings from their first year (only for those who taught 
reading), and student test data from all three years (only for teachers in tested grades and subjects).
The test score data were used to construct measures of treatment versus control teacher 
effectiveness at raising test scores, sometimes referred to as value added. Taken together, these 
measures can indicate whether there is a change in the mix of teachers that resulted from differential 
attrition. 

The study’s random assignment design allowed us to test the effects of treatment on the 
composition of the district’s teaching force by comparing the characteristics of treatment stayers to 
control stayers. Because treatment assignment was random, the treatment and control teachers 
should be equivalent, on average, prior to the intervention. After three years, when some teachers 
had left the district (or left teaching), the average quality and qualifications of both groups of 
teachers may have changed. We examined the impacts on the teaching workforce in terms of both 
teachers’ professional characteristics and teachers’ observed performance in the classroom.

No Impact on Stayers’ Background Characteristics. We found that the treatment did not
significantly alter the mix of teachers in terms of their professional characteristics. The top panels of
Tables VI.1 and VI.2 show average values of several teacher characteristics for stayers by treatment
status for one- and two-year districts, respectively. For each teacher characteristic, we tested the
hypothesis that comprehensive teacher induction had an impact on the percentage of stayers (people
who stayed in the same district) with that characteristic because, from the perspective of a district
administrator, the qualifications of those who remain are the most important. We examined
qualifications such as SAT scores (or ACT equivalent), selectiveness of the teachers’ undergraduate
college, and highest degree obtained. We also examined the percentage of teachers who had a
college major or minor in an education-related field, the amount of prior student teaching 
experience, certification status, and the route by which the teacher entered the profession. The tables 
show that there were no significant differences between the professional background characteristics 
of treatment stayers and control stayers. 

No Impact or Negative Impact on Stayers’ Prior Classroom Performance. We 
hypothesized that comprehensive teacher induction may have both productivity effects (helping 
teachers improve their practices) and composition effects (improving the mix of teachers by 
retaining strong teachers and encouraging weaker teachers to leave). In Chapter V, the experimental 
impacts on classroom practices that were measured in year 1 captured evidence on productivity 
effects because there had been no year-to-year attrition yet, and the impacts on test scores that were 
measured in year 3 captured evidence on combined productivity and composition effects. Here we
explore pure composition effects by asking whether the performance of treatment teachers on these
same outcomes was higher relative to control teachers for those who remained in the district until
some later time point. The bottom panels of Tables VI.1 and VI.2 provide this type of evidence for
one- and two-year districts, respectively.

The classroom practice measures from year 1 are contrasted for treatment and control teachers
who remained in their districts through the end of the study period (the beginning of their fourth
year). The test score outcomes from year 3 are contrasted for treatment and control teachers in that
particular analysis who remained in their districts through the end of the study period as well.
Table VI.1. Characteristics of District Stayers After Three Years, by Treatment Status (Percentages Except
Where Noted): One-Year Districts

Teacher Characteristic                     Treatment   Control  Difference   P-value

Background Characteristic
College entrance exam score
  (SAT combined score or equivalent)            1040      1013          27     0.325
Attended highly selective college               27.5      27.2         0.3     0.954
Major or minor in education                     78.7      80.9        -2.1     0.665
Student teaching experience (weeks)             15.8      15.4         0.4     0.772
Highest degree is master's or doctorate         22.4      28.2        -5.8     0.311
Entered the profession through
  traditional four-year program                 67.6      58.9         8.7     0.171
Certified (regular or probationary)             94.7      94.7         0.0     0.999
Career changer                                  14.5      12.8         1.7     0.682
Sample Size (Teachers)                           148       139
Sample Size (Schools)                             88        84

Year 1 Classroom Observation Score (on 1 to 5 scale)
Content of a literacy lesson                     2.3       2.6       -0.3*     0.024
Implementation of a literacy lesson              2.6       2.8        -0.2     0.151
Classroom culture                                3.0       3.1        -0.1     0.607
Sample Size (Teachers)                           100        94
Sample Size (Schools)                             71        65

Year 3 Test Scores (standard deviation units)
Reading                                         0.27      0.28        0.02     0.764
Math                                            0.26      0.11      -0.15*     0.008
Sample Size (Teachers)                            25        36
Sample Size (Schools)                             23        29

Source: Mathematica analysis using data from the College Board and ACT, Inc.; data from the 2006-2007 and
2007-2008 school years provided by participating school districts; Mathematica Third Teacher Mobility
Survey administered in fall 2008 to all study teachers; Mathematica classroom observations conducted
in spring 2006.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted to account for
the study design. The analysis of college entrance exam scores relied on a smaller sample (84
treatment and 86 control teachers and 61 treatment and 62 control schools). The analysis of Year 3
Test Scores relied on a different sample for reading (26 treatment and 34 control teachers and 24
treatment and 27 control schools) and math (per table values).

*Significantly different from zero at the 0.05 level.


The findings suggest a mix of neutral and negative composition effects of comprehensive
induction. Of the classroom practice measures, there was a significant negative impact of treatment
on the average rating of teachers’ lesson content in one-year districts (difference = -0.3, p-value =
0.024), but no significant differences between treatment and control stayers in the other two
classroom practice measures or in the two-year districts.
Table VI.2. Characteristics of District Stayers After Three Years, by Treatment Status (Percentages Except
Where Noted): Two-Year Districts

Teacher Characteristic                     Treatment   Control  Difference   P-value

Background Characteristic
College entrance exam scores
  (SAT combined score or equivalent)             905       935         -30     0.330
Attended highly selective college               23.7      21.4         2.3     0.703
Major or minor in education                     67.8      66.6         1.2     0.874
Student teaching experience (weeks)             12.3      12.3         0.1     0.975
Highest degree is master's or doctorate         16.7      10.2         6.5     0.196
Entered the profession through
  traditional four-year program                 61.1      66.4        -5.4     0.443
Certified (regular or probationary)             95.8      92.9         2.9     0.366
Career changer                                  17.1      11.7         5.4     0.292
Sample Size (Teachers)                           124        93
Sample Size (Schools)                             67        52

Year 1 Classroom Observation Score (on 1 to 5 scale)
Content of a literacy lesson                     2.4       2.4         0.0     0.690
Implementation of a literacy lesson              2.7       2.6         0.1     0.583
Classroom culture                                3.1       3.1         0.1     0.624
Sample Size (Teachers)                            87        62
Sample Size (Schools)                             50        41

Year 3 Test Scores (standard deviation units)
Reading                                         0.23      0.27        0.05     0.302
Math                                            0.12      0.24        0.11     0.054
Sample Size (Teachers)                            31        16
Sample Size (Schools)                             21        14

Source: Mathematica analysis using data from the College Board and ACT, Inc.; data from the 2006-2007 and
2007-2008 school years provided by participating school districts; Mathematica Third Teacher Mobility
Survey administered in fall 2008 to all study teachers; Mathematica classroom observations conducted
in spring 2006.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted to account for
the study design. The analysis of college entrance exam scores relied on a smaller sample (56
treatment and 47 control teachers and 40 treatment and 35 control schools). The analysis of Year 3
Test Scores relied on a different sample for reading (33 treatment and 17 control teachers and 24
treatment and 15 control schools) and math (per table values).

None of the differences is statistically significant at the 0.05 level.


The estimates of impact on test score outcomes for stayers only were qualitatively different
from the estimates derived for the whole sample of teachers who finished the third year. In Chapter
V we showed that test score impact estimates were statistically insignificant in one-year districts in
both subjects and positive and significant in two-year districts in both subjects. When we condition
on teachers’ returning for a fourth year, we find that the impact on math in one-year districts is
negative and significant (difference = -0.15, p-value = 0.008) and the impact on reading in one-year
districts is still insignificant (p-value = 0.764). Furthermore, the impacts on both math and reading
scores, which had been significant for two-year districts, were no longer significant when we
restricted the analysis to the district stayers (p-value = 0.302 for reading; p-value = 0.054 for math).
Because the conclusions change, one might suspect a negative composition effect. However,
statistical hypothesis testing of the differences between the two sets of findings suggests that these
apparent negative composition effects may in fact be due to chance variations in the teacher sample
and not necessarily a true effect of the treatment.

Taken together, the findings on composition effects suggest that comprehensive teacher 
induction did not significantly improve teacher quality in the district. We reached the same 
conclusion when we used alternative methods for constructing hypothesis tests. Sensitivity tests
are presented in Appendix D. 


Footnote: For example, rather than testing the significance of the coefficient estimate for the treatment variable in a
regression model predicting each characteristic, we tested whether the interaction of treatment status and stayer status
was a significant predictor, beyond stayer status and treatment status alone, of the variables listed in each row of
Tables VI.1 and VI.2. These tests confirmed the results shown earlier.
VII. CORRELATIONAL ANALYSES

We have shown that the treatment and control groups in both one-year and two-year districts
were equivalent on baseline characteristics (Chapter II) and then were exposed to different levels of
beginning teacher support during their first two years (Chapter IV). We also showed, however, that
treatment-control differences in induction services did not translate into a robust finding of positive
impacts across both one-year and two-year districts as hypothesized in the conceptual framework in
Figure I.1. By the end of Year 3, comprehensive induction showed no impacts on teacher attitudes
or retention across one-year and two-year districts. The impact estimates for the benchmark test
score sample showed mixed results: no impacts at the end of year 3 in one-year districts and positive
impacts in two-year districts (Chapter V). For those continuing to the beginning of year 4, the
benchmark impact estimates were statistically insignificant, or in the case of math scores in one-year
districts, negative and significant (Chapter VI).

This chapter attempts to answer a new set of questions raised by these findings. First, we ask
whether there is a relationship between these outcomes and the level or intensity of induction
services more generally, even when there are no impacts associated with the treatment-control
contrast (“comprehensive” versus “prevailing” induction services). Here we discuss how the full
range of variation in induction activities was related to teacher attitudes, student test scores, and
teacher retention, both within and between treatment and control groups.

Second, we report on whether better outcomes are associated with matching between the 
mentor and mentee on two dimensions, race/ethnicity and grade. We conducted this analysis using 
the treatment group, which is the part of the sample for which we have detailed information on 
mentor background. 

The results presented in this chapter should be interpreted with caution because the 
relationships we describe are correlational and not necessarily causal. Unlike the randomized 
experiment, which used variation in treatment status, the analysis here is nonexperimental: the 
variation in induction services that we explore can be caused by confounding factors that also 
explain teachers’ workforce attachment and effectiveness in the classroom. In particular, a nonexperimental 
estimate of the association of induction services with outcomes may be spurious, as it will confound 
the true (causal) impact of mentoring with the effect of the teacher’s own ability or motivation. For 
example, a high level of services for a particular teacher may result from a principal’s or mentor’s 
decision to help struggling teachers who would likely have had poor outcomes anyway. Alternatively, a 
high level might be obtained if an assertive, motivated teacher, who would have had positive 
outcomes regardless, takes the initiative and spends extra time with a mentor. 

A. Nonexperimental Methods 

Unlike the experimental analysis, which used a single variable (assignment to the treatment 
group) to measure the effect of induction on outcomes, the nonexperimental analysis uses four 
summary measures of induction support, all of which vary within each treatment group as well as 
between them. To construct these measures, we considered three primary dimensions on which 

w For a correlational analysis of classroom practices, see Glazerman et al. (2008). 
teacher induction programs vary: the breadth, intensity, and instructional focus of services (Ingersoll 
and Kralik 2004). We constructed indices corresponding to each of these dimensions and selected 
the induction service components for each index based on suggestions in the literature (Portner 
2005) and the emphasis that ETS and NTC placed on these components in their comprehensive 
induction programs (see Chapter IV). We added a fourth measure, a count of the number of years 
that beginning teachers had a mentor assigned to them within their first two years. The measures of 
internal consistency (Cronbach’s alpha) for the three index variables range from 0.26 to 0.54. Lower 
internal consistency measures suggest that indices may measure multidimensional constructs, so the 
results presented in this chapter should be interpreted with caution. Details on the statistical 
properties of the variables are given in Table A.5 in Appendix A. 

We constructed an Induction Breadth Index by counting how many of the following three 
activities the teacher reported in the three months prior to each survey, on average: 

• Met with a literacy or math coach 

• Worked with a study group (with new or both new and experienced teachers) 

• Observed others teaching 

We did this for three points in time: fall 2005, spring 2006, and fall 2006, and computed a 
weighted average of the values from the three reports that placed twice as much weight on the fall 
2006 measure because it was the only measure for all teachers from their second year of teaching. 
This was equivalent to assuming that spring 2007 measures were identical to fall 2006 measures. For 
example, a teacher with no activities in fall 2005, two activities in spring 2006, and three activities in 
fall 2006 would have an Induction Breadth Index of 2.0, which equals the average over four time 
points of zero, two, and two times three. The breadth index ranges from 0.0 (never received any of 
the supports) to 3.0 (reported receiving all three supports at each of the three time points). The 
average value of the breadth index was 1.8 activities and the standard deviation was 0.7. 
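
The weighting described above can be written out as a short calculation. The sketch below is illustrative only: the function name `weighted_index` is hypothetical and is not taken from the study's own code. It counts the fall 2006 report twice, which is the same as treating the unobserved spring 2007 report as identical to fall 2006.

```python
def weighted_index(fall_2005, spring_2006, fall_2006):
    """Average three survey reports, counting fall 2006 twice.

    Counting fall 2006 twice stands in for a fourth (spring 2007)
    report that was not collected for all teachers.
    """
    return (fall_2005 + spring_2006 + 2 * fall_2006) / 4


# The example from the text: no activities, then two, then three.
print(weighted_index(0, 2, 3))  # 2.0
```

A teacher who reported all three activities at every time point would score 3.0, and one who reported none would score 0.0, matching the stated range of the index.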

We constructed an Instructional Focus Index by examining another set of indicator variables 
at the same three time points averaged in the same way. These indicators measure whether the 
beginning teacher received: 

• Suggestions from a mentor to improve his/her practices during the most recent full 
week of teaching 

• A “moderate amount” or “a lot” of guidance in subject area content during the prior 
three months 

• Feedback on teaching, whether or not as part of a formal evaluation, during the prior 
three months 


M Additional dimensions include the types of teachers served by a program (new to teaching or new to a school) 
and the process for selecting and training mentors. 

M This variable was constructed using a survey question on math content if the outcome to be analyzed is math 
scores, literacy content if the outcome is reading scores, and math or reading if the outcome is teacher attitudes or 
mobility. 
Because the question on subject area guidance (the second item above in the Instructional 
Focus Index) was not included in the fall 2006 survey, the maximum score for the second year is 2.0 
(not 3.0) and the index ranges from 0.0 to 2.5 (not 3.0). The index can be interpreted as measuring 
the strength of instructional support received by beginning teachers or more literally the number of 
instructionally focused supports reported in a given period. The average value was 1.6 supports and 
the standard deviation was 0.6. 
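
The upper bound of 2.5 follows from the weighting across survey waves. A minimal sketch, assuming the index uses the same four-point average as the breadth index, with at most three items recorded in each first-year survey and two in fall 2006 (the variable names are hypothetical):

```python
# Maximum number of instructional focus items each survey could record.
max_items = {"fall_2005": 3, "spring_2006": 3, "fall_2006": 2}

# Fall 2006 is counted twice because it stands in for the whole second year.
weights = {"fall_2005": 1, "spring_2006": 1, "fall_2006": 2}

max_index = sum(weights[w] * max_items[w] for w in max_items) / sum(weights.values())
print(max_index)  # 2.5
```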

For program duration and intensity, we constructed an Induction Intensity Index by 
averaging the number of hours per week that beginning teachers reported spending in the 
following four activities in fall 2005, spring 2006, and fall 2006: 

• Mentoring sessions (both scheduled and informal) 

• Being observed teaching by a mentor 

• Professional development (for example, in-service workshops, study groups, seminars, 
and continuing education courses) learning instructional techniques and strategies 

• Professional development learning content area knowledge, specifically language arts, 
math, and science 

Since the questions on time spent on professional development activities were not included in 
the fall 2006 survey, the index only counts professional development in the first year and takes on 
values from 0 to the maximum reported time of 5.7 hours per week. The average value was 0.36 
hours (21.5 minutes) per week, with a standard deviation of 0.36. To test the robustness of the main 
findings we also examined variations on this index where we used subsets of the activities, such as 
hours of informal time with a mentor (by itself) and total time with a mentor (by itself). 
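
As a rough illustration of how one survey report might be turned into hours per week, the sketch below adds weekly mentoring and observation time to professional development time that was reported over three months. The 13-week length of a three-month period, the function name, and the example hours are all assumptions made for this illustration; the report does not state the exact conversion factor or how reports were combined across surveys.

```python
WEEKS_PER_THREE_MONTHS = 13  # assumed conversion factor, not given in the report


def weekly_intensity(mentoring_hours_per_week, observed_hours_per_week,
                     pd_technique_hours_3mo=0.0, pd_content_hours_3mo=0.0):
    """Hours per week of induction support for one survey report.

    Professional development is reported over three months, so it is
    converted to a weekly equivalent before being added. It defaults to
    zero because those questions were not asked in fall 2006.
    """
    pd_weekly = (pd_technique_hours_3mo + pd_content_hours_3mo) / WEEKS_PER_THREE_MONTHS
    return mentoring_hours_per_week + observed_hours_per_week + pd_weekly


# Hypothetical first-year report: 1 hour of mentoring, 0.5 hours observed,
# and 13 hours of professional development over the prior three months.
print(weekly_intensity(1.0, 0.5, pd_technique_hours_3mo=6.5, pd_content_hours_3mo=6.5))  # 2.5
```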

Finally, we used an indicator of whether the beginning teacher was assigned a mentor in the 
fall 2005, spring 2006, and fall 2006 (3 items) to create a variable that reflects the number of years 
(0, 1, or 2) the beginning teacher had an assigned mentor. As with the other indices, we averaged the 
first two time points to measure the first year and used the third time point to measure the second 
year. The average value of this measure was 1.2 years, with a standard deviation of 0.6. To test the 
robustness of the findings to different definitions of having a mentor we also examined a similar 
measure that used any mentor (assigned or not) and the number of mentors, since survey 
respondents were asked how many mentors they had at each time point. 

The measures of induction support are closely related to each other and do not represent 
completely independent information. Table VII.1 shows the correlations among the four induction 
support variables. The correlations are all positive and range from 0.13 to 0.53, meaning that 
teachers who reported a greater breadth of induction supports also tended to report a greater 
intensity of support and instructional focus, and were more likely to report having an assigned mentor. 
The implication for our analysis is that it can be difficult to distinguish the unique effects of each 
dimension of teacher induction from their joint effect. Therefore we estimated the regressions in 


ss Time spent in mentoring sessions is measured during a typical week; time spent being observed by a mentor is 
measured during the most recent full week of teaching; time spent in the two types of professional development 
activities is measured during a three-month period. For the Induction Intensity Index, the professional development 
measures are converted to a weekly equivalent and added to the first two measures. 
two ways. One way was to include all the induction measures in a single regression model. This has 
the advantage of allowing us to interpret the estimates as the unique contribution of each dimension, 
but the disadvantage of statistical imprecision, which makes it more likely that we falsely declare a 
true relationship to be insignificant. The other way we estimated the model was to conduct separate 
regressions for each dimension of induction support entered on its own. This approach provides us 
with more precise estimates, but ones that may conflate the dimensions. We report results from 
both approaches as well as the result of a hypothesis test of whether the four measures entered 
together are jointly statistically significant, which indicates whether or not there is an overall impact 
of induction supports on a particular outcome. Appendix A provides more details of the statistical 
model. We conducted a number of sensitivity analyses using alternate constructions of the indices 
and specifications of the regression model. 
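
The trade-off between the two approaches can be seen in a toy example. The sketch below uses made-up data with only two predictors and plain least squares, so it is far simpler than the models in this chapter, which also include covariates and clustered standard errors. It shows how a measure with no unique contribution can still pick up a sizable slope when entered on its own, because it is correlated with a measure that does matter. All names and numbers are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def separate_slope(x, y):
    """Slope from regressing y on x alone (with an intercept)."""
    return cov(x, y) / cov(x, x)


def joint_slopes(x1, x2, y):
    """Slopes from regressing y on x1 and x2 together (with an intercept)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Hypothetical data: the outcome depends only on x1, but x2 is correlated with x1.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 3.0, 2.0, 5.0, 4.0]
y = [2.0 * a for a in x1]

b1, b2 = joint_slopes(x1, x2, y)       # b1 is about 2, b2 is about 0
alone = separate_slope(x2, y)          # clearly positive despite b2 being about 0
print(round(alone, 2))  # 1.66
```

In the joint model the second measure gets essentially no weight, while on its own it appears strongly related to the outcome. This is the sense in which separate regressions may conflate the dimensions.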


Table VII.1. Summary Measures of Induction Support 


                            Breadth        Instructional Focus  Intensity      Assigned
Statistic/Induction         (Number of     (Number of           (Hours per     Mentor
Measure                     Activities)    Supports)            Week)          (Years)

Correlations
  Breadth                   1.00
  Instructional Focus       0.314          1.00
  Intensity                 0.163          0.435                1.00
  Assigned Mentor (Years)   0.127          0.530                0.435          1.00

Summary Statistics
  Minimum                   0.0            0.0                  0.00           0.0
  Mean                      1.8            1.6                  0.36           1.2
  Maximum                   3.0            2.5                  5.65           2.0
  Standard Deviation        0.7            0.6                  0.36           0.6
  Observations (Teachers)   959            965                  965            901

Source: Mathematica First, Second, and Third Induction Activities Surveys administered in fall 
2005, spring 2006, and fall 2006 to all study teachers. 

Note: Breadth is the average number of these activities over a three-month period: (1) met with 
a literacy or math coach, (2) met with a study group, and (3) observed others teaching. 

Instructional Focus is the average number of these three supports: (1) suggestions from 
a mentor to improve his/her teaching, (2) at least a moderate amount of guidance in 
subject area content, and (3) feedback on teaching. 

Intensity is the average number of hours per week that beginning teachers reported 
spending: (1) in mentoring sessions, (2) being observed teaching by mentor, (3) in 
professional development learning instructional techniques and strategies, and (4) in 
professional development learning content area knowledge, specifically language arts, 
math, and science. 

Race/ethnicity match and grade match were measured as dichotomous variables indicating 
whether the mentor and mentee matched on these characteristics. Because we have data on the 
characteristics of the mentors for the treatment group only, these measures were constructed only 
for the treatment group. The mentor survey and teacher background surveys were administered in 
fall 2005. We collapsed race/ethnicity into the following categories: white, African-American, and 
Hispanic. For these analyses, we replaced assignment to treatment status with variables indicating 
whether the mentor and mentee were of the same race/ethnicity and whether they taught the same 
grade in the first year of the study. 
Although the induction activities measures are drawn from the first two years of the study, the 
outcome measures generally pertain to the third and fourth (final) year of the study. The student 
achievement results measure outcomes for students taught by study teachers in the 2007-2008 
school year. Teacher attitudes reported here were measured in fall 2007, after the start of the 
teachers’ third year, and fall 2008, after the start of the teachers’ fourth year. Teacher retention was 
measured in fall 2008. It is important to recognize that all analyses pertain to teachers who persisted 
in the profession until at least the beginning of the second year of the study so we could observe 
two years of induction activities. 

The analyses use the same regression methods as the experimental analyses presented in 
Chapters V and VI, but instead of assignment to treatment status as the key explanatory variable, we 
used measures, described above, of induction support or matching between mentor and beginning 
teacher. For the first set of analyses, the key explanatory variables were the three indices of induction 
services and the number of years the teacher had an assigned mentor. Each regression model 
includes control variables that match the covariates used in the corresponding experimental analysis 
for the outcome under study (see Appendix A). 

B. Nonexperimental Results 

Teachers Receiving More Induction Support Also Reported Higher Satisfaction. As 
reported in Table VII.2, beginning teachers who received more induction support reported being 
more satisfied, on average, than those who received less. The four induction measures were 
collectively related to satisfaction with school (p-value = 0.02) and satisfaction with class (p-value = 
0.01), but not satisfaction with the teaching career (p-value = 0.07). 

Induction intensity and instructional focus stood out as the two aspects of support that were 
positively related to teacher attitudes. (Breadth of induction had a negative relationship with 
satisfaction, but was not significant when estimated on its own.) Adding one instructionally focused 
support (among those in the instructional focus index) over a three-month period is associated with 
an increase in satisfaction of 0.13 points on a four-point scale of satisfaction with school, 0.09 points 
for satisfaction with class, and 0.09 points for satisfaction with career. One way to describe the size 
of these relationships is to note that the estimate of 0.13 for satisfaction with school is equivalent to 
moving 13 percent of the way between “somewhat satisfied” and “very satisfied.” An increase of one 
hour per week with a mentor in any of the activities in the induction intensity index is associated 
with increases in reported satisfaction of 0.04 for each of the three dimensions of satisfaction. 

The relationship of induction services to teachers' reported feelings of preparedness exhibited a 
similar pattern but with fewer statistically significant relationships. We examined the same three 
dimensions of preparedness that we covered in the experimental analysis: preparedness to work with 
students, preparedness to instruct, and preparedness to work with other staff. The joint test 


We included teachers who did not complete all three induction activities surveys if they had completed the third 
induction survey in fall 2006 and at least one of the other induction surveys from fall 2005 and spring 2006. If necessary, 
we imputed missing values for the missing survey using data from the nonmissing surveys. 
Table VII.2. Association Between Induction Support and Teacher Satisfaction 

                            Satisfaction with School            Satisfaction with Class             Satisfaction with Teaching Career
                            Joint Model       Separate          Joint Model       Separate          Joint Model       Separate
                                              Regressions                         Regressions                         Regressions
Induction Measure           Coef.    P-value  Coef.    P-value  Coef.    P-value  Coef.    P-value  Coef.    P-value  Coef.    P-value

Breadth                     0.01     0.870    0.03     0.366    0.08*    0.021    0.05     0.158    0.02     0.532    0.00     0.893
Instructional Focus         0.11*    0.043    0.13*    0.002    0.09     0.096    0.09*    0.024    0.07     0.179    0.09*    0.029
Intensity                   0.02     0.260    0.04*    0.012    0.03     0.104    0.04*    0.011    0.03     0.062    0.04*    0.009
Assigned Mentor (Years)     0.01     0.889    0.07     0.067    0.01     0.900    0.06     0.133    0.02     0.600    0.04     0.318

Joint Test (all four
  measures different
  from zero)                *        0.021                      *        0.010                               0.070

Sample Size (Schools)       347                                 348                                 348
Sample Size (Teachers)      689                                 693                                 695


Source: Mathematica First, Second, and Third Induction Activities Surveys administered in fall 2005, spring 2006, and fall 2006 to all 
study teachers. 

Note: Breadth is the average number of these activities over a three-month period: (1) met with a literacy or math coach, (2) met with 
a study group, and (3) observed others teaching. 

Instructional Focus is the average number of these three supports: (1) suggestions from a mentor to improve his/her teaching, 
(2) at least a moderate amount of guidance in subject area content, and (3) feedback on teaching. 

Intensity is the average number of hours per week that beginning teachers reported spending: (1) in mentoring sessions, (2) 
being observed teaching by mentor, (3) in professional development learning instructional techniques and strategies, and (4) in 
professional development learning content area knowledge, specifically language arts, math, and science. 

Regressions include pretest, district-by-grade fixed effects, and account for clustering of students within schools. 

*Significantly different from zero at the 0.05 level. 
suggested a significant relationship with preparedness to work with others (p-value = 0.01), but an 
insignificant relationship with the other two dimensions. The only one of the four induction services 
measures that was significantly associated with preparedness to work with others was Induction 
Intensity, with an association of 0.04 (p-value = 0.01). 

None of the findings for teacher attitudes changed when we varied the method for combining 
information from the four induction activities surveys conducted in the first two years. That is, we 
applied different weights to the fall of the first year, spring of the first year, fall of the second year, 
and spring of the second year (which was only measured for teachers in two-year districts). One set 
of weights counted the first three time points equally. Another counted only the first two time 
points. Another used all four time points, but used only two-year districts, since the fourth survey 
was not administered in one-year districts. 

Nor did the findings change when we examined teacher attitudes measured in the fourth year as 
opposed to the same outcomes measured in the third year. The fourth year outcomes are more 
recent and hence pick up longer term trends, but they also must be interpreted differently because 
they only pertain to the teachers who remained in the profession through the beginning of the 
fourth year. One might expect that this sample of stayers excludes those whose induction 
experiences reduced their satisfaction and consequently influenced their decision to leave. 

Increased Induction Support Was Not Associated with Higher Test Scores. We did not 
find consistent evidence that induction support was related to student achievement in math or 
reading, as shown in Table VII.3. For math, the joint test using all four induction measures was 
insignificant. There were three insignificant relationships (breadth, instructional focus, and intensity) 
and one positive and significant relationship (assigned mentor) when we estimated the effect of the 
four induction indices on math scores simultaneously. When we entered the indices one at a time, 
none of the induction measures had a significant association with math scores. 

For reading scores, the test of a joint effect of induction measures was not significant and the 
regression coefficient estimates for all of the individual index measures were statistically 
insignificant. Taken together, these findings do not present a pattern of significant impacts and 
therefore do not support the hypothesis that teachers who received more induction services 
produced higher student test scores. 

The relationships between induction services and test scores were also statistically insignificant 
when we used different modeling assumptions. When we measured induction services received in a 
variety of different ways and used a variety of model specifications, we did not find evidence of a 
significant association. 

Increased Induction Support Was Not Associated With Higher Retention. We found that 
none of the four measures of beginning teacher support was related to retention in the district or in 
the profession (Table VII.4). We conducted many of the same specification checks described above 
and reached the same conclusion regardless of how we specified the model, defined the variables, or 
delineated the sample. 
Table VII.3. Association Between Induction Support and Test Scores 

                            Math                                Reading
                            Joint Model       Separate          Joint Model       Separate
                                              Regressions                         Regressions
Induction Measure           Coef.    P-value  Coef.    P-value  Coef.    P-value  Coef.    P-value

Breadth                     0.02     0.650    0.01     0.902    0.04     0.180    0.04     0.172
Instructional Focus         0.07     0.126    0.02     0.622    0.01     0.834    0.01     0.787
Intensity                   0.00     0.644    0.00     0.546    0.00     0.713    0.00     0.633
Assigned Mentor (Years)     0.09*    0.046    0.06     0.130    0.03     0.448    0.02     0.560

Joint Test (all four
  measures different
  from zero)                         0.270                               0.635

Sample Size (Teachers)      141                                 149
Sample Size (Students)      2,405                               2,607


Source: Mathematica analysis using data from the 2005-2006 and 2006-2007 school years 
provided by participating school districts; Mathematica First, Second, and Third 
Induction Activities Surveys administered in fall 2005, spring 2006, and fall 2006 to all 
study teachers. 

Note: Breadth is the average number of these activities over a three-month period: (1) met with 
a literacy or math coach, (2) met with a study group, and (3) observed others teaching. 

Instructional Focus is the average number of these three supports over a three-month 
period: (1) suggestions from a mentor to improve his/her teaching, (2) at least a 
moderate amount of guidance in subject area content, and (3) feedback on teaching. 

Intensity is the average number of hours per week that beginning teachers reported 
spending: (1) in mentoring sessions, (2) being observed teaching by mentor, (3) in 
professional development learning instructional techniques and strategies, and (4) in 
professional development learning content area knowledge, specifically language arts, 
math, and science. 

Regressions include pretest, district-by-grade fixed effects, and account for clustering of 
students within schools. 

*Significantly different from zero at the 0.05 level. 
Table VII.4. Association Between Induction Support and Teacher Mobility 

                            Remains in District                 Remains in Teaching
                            Joint Model       Separate          Joint Model       Separate
                                              Regressions                         Regressions
Induction Measure           Effect   P-value  Effect   P-value  Effect   P-value  Effect   P-value

Breadth                     0.03     0.343    0.03     0.333    0.00     0.855    0.00     0.964
Instructional Focus         -0.03    0.480    0.01     0.868    0.00     0.644    0.00     0.598
Intensity                   0.01     0.312    0.01     0.294    0.00     0.414    0.00     0.207
Assigned Mentor (Years)     0.02     0.675    0.02     0.612    0.01     0.194    0.01     0.119

Joint Test (all four
  measures different
  from zero)                         0.623                               0.488

Sample Size (Schools)       338                                 359
Sample Size (Teachers)      678                                 750


Source: Mathematica Teacher Background Survey administered in fall 2005, Mathematica Third 
Teacher Mobility Survey administered in fall 2008, and Mathematica First, Second, and 
Third Induction Activities Surveys administered in fall 2005, spring 2006, and fall 2006 
to all study teachers. 

Note: Breadth is the average number of these activities over a three-month period: (1) met with 
a literacy or math coach, (2) met with a study group, and (3) observed others teaching. 

Instructional Focus is the average number of these three supports over a three-month 
period: (1) suggestions from a mentor to improve his/her teaching, (2) at least a 
moderate amount of guidance in subject area content, and (3) feedback on teaching. 

Intensity is the average number of hours per week that beginning teachers reported 
spending: (1) in mentoring sessions, (2) being observed teaching by mentor, (3) in 
professional development learning instructional techniques and strategies, and (4) in 
professional development learning content area knowledge, specifically language arts, 
math, and science. 

Regressions use a logit model to account for baseline characteristics and robust 
standard errors to account for clustering of teachers within schools. Marginal effects are 
reported instead of coefficients. 

*Significantly different from zero at the 0.05 level. 
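
Because the retention models are logits, the table reports marginal effects instead of raw coefficients. The sketch below shows the standard conversion for a single continuous predictor, evaluated at a chosen predicted probability. The coefficient and probability used here are hypothetical, and the report does not say whether marginal effects were evaluated at the sample mean or averaged over teachers.

```python
import math


def logistic(z):
    """Logistic function mapping a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_marginal_effect(coefficient, probability):
    """Change in predicted probability per one-unit change in the predictor.

    For a logit model, the derivative of the probability with respect to
    a predictor is coefficient * p * (1 - p).
    """
    return coefficient * probability * (1.0 - probability)


# Hypothetical: a logit coefficient of 0.2 at a retention probability of 0.8.
print(round(logit_marginal_effect(0.2, 0.8), 3))  # 0.032
```

Read this way, an entry such as 0.03 in the table corresponds to a change of about three percentage points in the probability of remaining, per unit of the induction measure.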


Match With Mentor on Race or Grade Was Associated with Negative Retention 
Outcomes. Beginning teachers who had the same race/ethnicity as their mentor or taught the same 
grade as their mentor had taught had lower rates of retention in the district and in the profession than 
those who did not have such a match, contradicting the hypothesis that better matching would 
produce better outcomes. The regression-adjusted differences were 17 percent for a match on 
race/ethnicity and 16 percent for grade level (see Table VII.5). When we examined the other two 
outcomes, teacher attitudes and student achievement, we found no evidence of a statistically 
significant relationship with a mentor match. 
Table VII.5. Association Between Mentor Match and Attitudes, Test Scores, and Retention 

                            Attitudes: Satisfaction with    Test Scores           Retention
                            School, Class, and Career
Outcome                     School    Class     Career      Math      Reading     In District  In Teaching

Race/Ethnicity Match
  Coefficient               -0.04     0.12      0.06        0.03      -0.06       -0.17*       -0.06*
  P-value                   0.722     0.303     0.589       0.768     0.428       0.037        0.006
  Sample Size (Schools)     159       160       160         -         -           166          171
  Sample Size (Teachers)    341       344       345         73        76          363          396
  Sample Size (Students)    -         -         -           1,296     1,415       -            -

Grade Match
  Coefficient               -0.07     -0.01     -0.05       -0.01     -0.08       -0.16*       0.01
  P-value                   0.311     0.885     0.487       0.841     0.107       0.003        0.533
  Sample Size (Schools)     182       183       183         -         -           185          193
  Sample Size (Teachers)    376       380       381         79        83          392          431
  Sample Size (Students)    -         -         -           1,392     1,549       -            -


Source: Mathematica analysis using data from the 2005-2006 and 2006-2007 school years provided by 
participating school districts; Mathematica First, Second, and Third Induction Activities Surveys 
administered in fall 2005, spring 2006, and fall 2006 to all study teachers. 

Note: Data are regression-adjusted to account for pretest, district-by-grade fixed effects, and clustering of 
students within schools. 

*Significantly different from zero at the 0.05 level. 
APPENDIX A 

SUPPLEMENTAL INFORMATION FOR CHAPTERS II AND III 

This appendix provides technical details of the impact estimation method and discusses the test 
score data in greater depth. 

A. Impact Estimation 

Basic Model. To estimate the effects of comprehensive teacher induction on outcomes, we 
implemented a two-level regression model. The first level corresponds to teachers (for the classroom 
practices, teacher attitudes, and retention analyses) and the second level to schools. Treatment effects 
are estimated in the level two model, in which the sample size is dictated by the number of schools, 
not teachers. The basic form of the model for the teacher attitudes and retention analyses is 
presented in Equations (A.1) and (A.2), which express teacher-level analyses (A.1) and school-level 
analyses (A.2): 


$$Y_{ij} = c_j + \beta' X_{ij} + e_{ij} \tag{A.1}$$

$$c_j = \mu + \delta T_j + \gamma' Z_j + u_j \tag{A.2}$$

where $Y_{ij}$ is the outcome of interest for teacher $i$ in school $j$; $c_j$ is a school-specific intercept; $X_{ij}$ is 
a vector that includes baseline teacher characteristics; $e_{ij}$ is an independently and identically 
distributed teacher-level random error term that captures the effects of unobserved factors that 
influence the outcome; $T_j$ is an indicator that equals 1 if school $j$ was randomly assigned to the 
treatment group (receiving services from one of the two comprehensive induction programs) and 
equals 0 otherwise; $Z_j$ includes school characteristics; $u_j$ is a random component representing 
unobserved factors that vary by school (the random “school effect”); and $\beta$, $\mu$, $\delta$, and $\gamma$ are 
parameters or vectors of parameters to be estimated. We also must estimate the variance of the 
school effects $u_j$. 


By substituting Equation (A.2) into Equation (A.1), we can express the unified model as 
Equation (A.3): 

$$Y_{ij} = \mu + \delta T_j + \beta' X_{ij} + \gamma' Z_j + u_j + e_{ij} \tag{A.3}$$


In Equation (A.3), in place of the generic outcome $Y_{ij}$, we substitute classroom practices, 
teacher satisfaction, or teacher retention data. Teacher mobility outcomes are binary or categorical. 
In one model specification, we use an indicator for whether the teacher returned for a fourth year of 
teaching. In another, we use a variable with separate categories for remaining in, moving within, or 
leaving the teaching profession. In the case of categorical outcome variables, we use bivariate or 
multinomial logistic regression to estimate the parameters of Equation (A.3). 
The student achievement analysis is mathematically similar. Equations (A.4) and (A.5) express 
the basic student achievement model, with the unified model expressed by Equation (A.6): 

$$Y_{hij,t} = c_j + \lambda Y_{hij,t-1} + \beta' X_{ij} + \varphi' W_{hij} + e_{hij} \tag{A.4}$$

$$c_j = \mu + \delta T_j + \gamma' Z_j + u_j \tag{A.5}$$

$$Y_{hij,t} = \mu + \delta T_j + \lambda Y_{hij,t-1} + \beta' X_{ij} + \varphi' W_{hij} + \gamma' Z_j + u_j + e_{hij} \tag{A.6}$$

Equation (A.6) differs from Equation (A.3) in two main ways. First, the level one units are 
students, represented by the $h$ subscript, rather than teachers. Second, Equation (A.6) includes a 
lagged measure of the dependent variable, $Y_{hij,t-1}$, which makes it a growth model of student 
achievement, sometimes referred to as a value-added model. In other words, $Y_{hij,t}$ can be thought of 
as the posttest, which depends on the pretest. Other variables in the model include teacher 
background characteristics $X_{ij}$, student background characteristics $W_{hij}$, an indicator for random 
assignment to the treatment group, and a set of district-by-grade fixed effects which make up the 
vector of school-level variables $Z_j$. We allow the coefficient on the pretest $\lambda$ to vary by district-
grade (that is, to vary by test). We substitute data for both math and reading test scores for the 
outcomes $Y_{hij,t}$ and $Y_{hij,t-1}$. 

In Equations (A.3) and (A.6), the coefficient $\delta$ for the treatment group indicator represents the 
impact of the receipt of comprehensive induction services and is the main parameter of interest. The 
standard error of this impact estimate accounts for the design effects attributable to the clustering of 
teachers and students within schools, which occurs because teachers or students within schools tend 
to have similar outcomes. The school at the time of random assignment is always used when 
clustering teachers and students, even if teachers have changed schools. 

Equations (A.3) and (A.6) can be thought of as mixed models or as hierarchical models. They 
are “mixed” because they contain fixed effects (represented by $\mu$, $\delta$, $\beta$, $\gamma$, $\lambda$, and $\varphi$) as well as 
random effects (represented by $\varepsilon$ and $u$). They are hierarchical because they embed a school-level 
model (indexed by j) within an individual-level model (indexed by h or i). Several techniques are 
available for estimating such a model, including ordinary least squares (OLS) with robust standard 
errors (see Huber 1967; White 1980); generalized least squares (GLS) estimates of a random effects 
model; maximum likelihood; and restricted maximum likelihood. We estimated the standard errors of 
the model by using each of these methods, but the findings did not change, except for one student 
achievement result noted in Chapter V. We report findings based on GLS estimates of a random 
effects model for classroom practices, teacher attitudes, and teacher retention. For student 
achievement models, in which we collect data from each school district, we use ordinary least squares 
with robust standard errors, which does not require as many assumptions about the distribution of 
the error terms as a random effects model. 

A teacher background questionnaire, discussed in Chapter III, provides a long list of potential 
explanatory variables for inclusion in the model (the X vector), including demographic and 
household characteristics, information on teachers’ education and professional background, and 
teaching assignment. In addition, we include school-level variables (the Z vector) from the Common 
Core of Data (CCD) of the National Center for Education Statistics. For the student achievement 
analyses, districts provided student pretest scores ($Y_{hij,t-1}$) and student demographic characteristics 
that could be included in the model (the W vector). 

We used a separate set of covariates for each type of outcome we analyzed. Table A.1 presents 
the lists by analysis type. The analysis of classroom practices (Table V.1) used teacher personal 
characteristics, teacher professional characteristics, school characteristics, and district and grade fixed 
effects. The student achievement benchmark analyses (Tables V.2 and V.5) had normalized student 
pretest score, student background characteristics, teacher personal characteristics, teacher 
professional characteristics, and district-by-grade fixed effects. The analysis of teacher attitudes 
(Tables D.1 to D.8) had district and grade fixed effects and no other covariates. Finally, the 
benchmark teacher retention analysis (Tables D.11 and D.12) included teacher personal 
characteristics, teacher professional characteristics, teacher neighborhood characteristics, school 
characteristics, and district and grade fixed effects. 

We generate results for each district by regressing the outcome variable on the benchmark 
covariates, an indicator variable for random assignment of the school to the treatment group, and a 
set of interaction terms between the treatment indicator and each district. The district-specific 
impact is the coefficient on its interaction term. To derive these results, we pool one-year and 
two-year districts. See Figures B.1 to B.4, C.4 to C.9, and D.2 to D.5. An analogous method is used to 
derive grade-by-grade results for student achievement shown in Tables V.3 and V.6. 

Instrumental Variable Estimation to Correct for Measurement Error. One feature of 
achievement growth models is that the pretest is used as an explanatory variable. However, the pretest 
score is an imperfect measure of prior ability, so measurement error in the pretest induces a bias in 
the pretest coefficient estimate and in the coefficient estimates of anything correlated with the 
pretest. By design, treatment status, whose coefficient $\delta$ is the main parameter of interest, is 
statistically independent of the pretest, but even chance correlations can influence the impact 
estimate. In addition, the possibility that school principals of treatment teachers could have assigned 
different types of students to their teachers than principals of control teachers raises the possibility 
of a relationship between treatment and pretest. 

Therefore, as a specification check of the benchmark student achievement results, we estimated 
a regression model using the method of instrumental variables to adjust for the measurement error 
problem. An instrumental variable is one that explains variation in the variable measured with error 
but is not correlated with the final outcome. We used the opposite subject test score as an 
instrument in this exercise (math for reading and reading for math). We therefore estimate this 
system of equations, using M for the math test score and R for the reading test score: 

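$$M_{hij,t-1} = \alpha + \theta R_{hij,t-1} + \pi_1' X_{ij} + \pi_2' W_{hij} + \pi_3 T_j + \pi_4' Z_j + v_{hij} \tag{A.7}$$

$$M_{hij} = \mu + \lambda M_{hij,t-1} + \beta' X_{ij} + \varphi' W_{hij} + \delta T_j + \gamma' Z_j + \varepsilon_{hij} \tag{A.8}$$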


Note: CCD data are reported with a lag; therefore, the school-level information describes schools in 
one year before the study began. 

Table A.1. Covariates Included in Impact Estimation Models by Analysis Type 

Analysis: Classroom practices 
Tables: V.1, C.1, C.2, C.3, C.4 
Covariates Included in the Impact Estimation Model: 
Teacher personal characteristics: 
• Age 
• Gender 
Teacher professional characteristics: 
• Route into teaching 
• Certification status 
• Highest degree 
• Months of teaching experience 
• Grade level 
School characteristics: 
• Percentage of students eligible to receive a free or reduced-price lunch 
• Percentage of students who are white 
District fixed effects 
Grade fixed effects 

Analysis: Student achievement (benchmark model) 
Tables: V.2, V.3, V.4 (lines 1, 4-7, 11), V.5, V.6, V.7 (lines 1, 4-7, 11), C.11 
Covariates Included in the Impact Estimation Model: 
Student characteristics: 
• Gender 
• Race/ethnicity 
• Special education status 
• English-language learner status 
• Free/reduced-price lunch status 
• Over age for grade 
Teacher personal characteristics: 
• Age 
• Age squared 
• Gender 
• Race/ethnicity 
• Teacher race/ethnicity matches that of a majority of students 
Teacher professional characteristics: 
• Route into teaching 
• Highest degree 
• Holds a degree in an education-related field 
• First-year teacher 
• Hired after the school year began 
• Attended a competitive college 
• Held a nonteaching job for five or more years 
Normalized pretest score 
District-by-grade fixed effects 

Analysis: Student achievement (alternative model 1) 
Tables: V.4 (line 2), V.7 (line 2) 
Covariates Included in the Impact Estimation Model: 
Student characteristics: 
• Gender 
• Race/ethnicity 
• Special education status 
• English-language learner status 
• Free/reduced-price lunch status 
• Over age for grade 
Normalized pretest score 
District-by-grade fixed effects 

Analysis: Student achievement (alternative model 2) 
Tables: V.4 (line 3), V.7 (line 3) 
Covariates Included in the Impact Estimation Model: 
Normalized pretest score 
District-by-grade fixed effects 

Analysis: Student achievement (alternative model 3) 
Tables: V.4 (lines 8-9), V.7 (lines 8-9) 
Covariates Included in the Impact Estimation Model: 
Student characteristics: 
• Gender 
• Race/ethnicity 
• Special education status 
• English-language learner status 
• Free/reduced-price lunch status 
• Over age for grade 
Teacher personal characteristics: 
• Age 
• Age squared 
• Gender 
• Race/ethnicity 
• Teacher race/ethnicity matches that of a majority of students 
Teacher professional characteristics: 
• Route into teaching 
• Highest degree 
• Holds a degree in an education-related field 
• First-year teacher 
• Hired after the school year began 
• Attended a competitive college 
• Held a nonteaching job for five or more years 
District-by-grade fixed effects 

Analysis: Student achievement (alternative model 4) 
Tables: V.4 (line 10), V.7 (line 10) 
Covariates Included in the Impact Estimation Model: 
Student characteristics: 
• Gender 
• Race/ethnicity 
• Special education status 
• English-language learner status 
• Free/reduced-price lunch status 
• Over age for grade 
Teacher personal characteristics: 
• Age 
• Age squared 
• Gender 
• Race/ethnicity 
• Teacher race/ethnicity matches that of a majority of students 
Teacher professional characteristics: 
• Route into teaching 
• Highest degree 
• Holds a degree in an education-related field 
• First-year teacher 
• Hired after the school year began 
• Attended a competitive college 
• Held a nonteaching job for five or more years 
Normalized pretest score 
District fixed effects 

Analysis: Teacher attitudes 
Tables: D.1, D.2, D.3, D.4, D.5, D.6, D.7, D.8 
Covariates Included in the Impact Estimation Model: 
District fixed effects 
Grade fixed effects 

Analysis: Teacher mobility 
Tables: D.9, D.10, D.11, D.12 
Covariates Included in the Impact Estimation Model: 
Teacher personal characteristics: 
• Age 
• Gender 
• Race/ethnicity 
• Teacher race/ethnicity matches that of a majority of students 
• Marital status 
• Teacher has children 
Teacher professional characteristics: 
• Months of relevant teaching experience 
• Certification status 
• Holds a degree in an education-related field 
• Hired after the school year began 
• Attended a competitive college 
• Held a nonteaching job for five or more years 
• Taught a single grade level 
Teacher neighborhood characteristics: 
• Commuting distance 
• Teacher is a homeowner 
• Teacher lives in the school district 
• Teacher attended an elementary school in which the socioeconomic status of students was 
  similar to the school taught in 
School characteristics: 
• Percentage of students eligible to receive a free or reduced-price lunch 
• Percentage of students who are white 
District fixed effects 
Grade fixed effects 

Analysis: Teacher mobility (alternative model 1) 
Tables: D.11, D.12 
Covariates Included in the Impact Estimation Model: 
District fixed effects 

Analysis: Teacher mobility (alternative model 2) 
Tables: D.11, D.12 
Covariates Included in the Impact Estimation Model: 
Teacher personal characteristics: 
• Age 
• Gender 
• Race/ethnicity 
• Teacher race/ethnicity matches that of a majority of students 
• Teacher has children 
District fixed effects 
Grade fixed effects 


The first-stage equation (A.7) regresses $M_{hij,t-1}$ on all of the other independent variables from 
the outcome equation (A.8) plus an instrumental variable, the opposite-subject pretest (that is, we 
use the reading pretest as an instrument for the math pretest and vice versa). In the second-stage 
(outcome) equation (A.8), $M_{hij,t-1}$ is replaced by its predicted value, which is generated from 
Equation (A.7) by setting the error term $v_{hij}$ to zero. Instrumental variable results are reported on 
line 11 in the top and bottom panels of Tables V.7 and V.8. 
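
The logic of the instrumental variable correction can be illustrated with a small simulation. The 
sketch below is not the study's estimation code. It assumes a single latent ability, two noisy 
pretests (math and reading), and a math posttest; the names `simulate`, `true_slope`, and all parameter 
values are hypothetical. It also omits covariates and fixed effects, in which case the two-stage 
estimate with one regressor and one instrument reduces to a ratio of covariances. 

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def simulate(n=20000, true_slope=0.8, seed=0):
    """Hypothetical data: both pretests measure the same ability with error."""
    rng = random.Random(seed)
    math_pre, read_pre, math_post = [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        math_pre.append(ability + rng.gauss(0, 1))   # noisy math pretest
        read_pre.append(ability + rng.gauss(0, 1))   # noisy reading pretest (instrument)
        math_post.append(true_slope * ability + rng.gauss(0, 0.5))
    return math_pre, read_pre, math_post


math_pre, read_pre, math_post = simulate()

# Ordinary least squares slope of posttest on the error-laden pretest.
ols_slope = cov(math_pre, math_post) / cov(math_pre, math_pre)

# Instrumental variable slope, using the opposite-subject pretest as instrument.
iv_slope = cov(read_pre, math_post) / cov(read_pre, math_pre)

print(f"OLS slope: {ols_slope:.2f}, IV slope: {iv_slope:.2f}")
```

Under these assumptions the OLS slope is attenuated toward about half of `true_slope` because half of 
the pretest variance is noise, while the IV slope stays close to `true_slope`. 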

Difference-in-Differences Analysis of the Change in the Treatment Effect for Student 
Achievement in Multiple Years. To measure the improvement of treatment teachers relative to 
control teachers from one study year to another, we employ a difference-in-differences estimator. In 
particular, we compare the difference in student outcomes between treatment and control teachers 
in Year 3 to the corresponding differences in Year 2 and separately compare treatment/control 
differences in Year 3 and Year 1. Both comparisons follow the same method. We pool data on all 
students taught by the common sample of teachers in the data in both years and estimate the 
following model: 

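$$Y_{hij} = \mu + \alpha C_{hij} + \delta_1 T_j + \delta_2 (T_j \cdot C_{hij}) + \lambda_1 Y_{hij,t-1} + \lambda_2 (Y_{hij,t-1} \cdot C_{hij}) + \beta_1' X_{ij} + \beta_2' (X_{ij} \cdot C_{hij})$$

$$\qquad + \varphi_1' W_{hij} + \varphi_2' (W_{hij} \cdot C_{hij}) + \gamma_1' Z_j + \gamma_2' (Z_j \cdot C_{hij}) + [u_j + \varepsilon_{hij}]$$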


In this model, the student posttest is regressed on an indicator variable for cohort $C_{hij}$ (the 
Year 1, Year 2, or Year 3 cohort of students), student pretest, teacher background characteristics, 
student background characteristics, district-by-grade fixed effects, assignment to the treatment 
group $T_j$, and the interaction of all variables in the model with cohort. 

Students in the Year 1 or Year 2 cohort are assigned weights in order to equalize the weight for 
a teacher in the earlier cohort and the Year 3 cohort. For example, if a teacher has 20 students in 
cohort two and 10 students in cohort three, each student in cohort two will receive a weight of 0.5 
so that the total weight for that teacher in cohort two is 10 (since 20 × 0.5 = 10). Conversely, for a 
teacher with 10 students in cohort two and 20 students in cohort three, each student in cohort two 
receives a weight of 2. 
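
The weighting rule in this example can be written as a one-line calculation. The following is a 
minimal sketch; the function name `earlier_cohort_weight` is hypothetical and is not taken from the 
study's code. 

```python
def earlier_cohort_weight(n_earlier, n_year3):
    """Per-student weight for a teacher's earlier-cohort students.

    The weight is chosen so that the teacher's total weight in the earlier
    cohort equals the teacher's number of students in the Year 3 cohort.
    """
    if n_earlier <= 0 or n_year3 <= 0:
        raise ValueError("teacher must have students in both cohorts")
    return n_year3 / n_earlier


# The two cases described in the text.
print(earlier_cohort_weight(20, 10))  # 0.5
print(earlier_cohort_weight(10, 20))  # 2.0
```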

The key parameter of interest is $\delta_2$, which estimates the effect of the interaction of treatment 
status and cohort. This parameter estimates the difference of the treatment/control contrast in the 
teacher effect on student test scores between Year 1 and Year 3 (or between Year 2 and Year 3). We 
use robust standard errors to account for correlation in outcomes for students clustered within 
schools. 
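
Stripped of covariates and fixed effects, the interaction coefficient is the difference between two 
treatment/control gaps in weighted mean scores. The sketch below illustrates only that simplified 
case; it is not the regression used in the study, and the record layout and all values are 
hypothetical. 

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def diff_in_diff(records):
    """Unadjusted difference-in-differences of weighted cell means.

    records: iterable of (score, treated, later_cohort, weight), where
    treated and later_cohort are 0/1 indicators.
    """
    cells = {}
    for score, treated, later, weight in records:
        scores, weights = cells.setdefault((treated, later), ([], []))
        scores.append(score)
        weights.append(weight)
    mean = {key: weighted_mean(s, w) for key, (s, w) in cells.items()}
    later_gap = mean[(1, 1)] - mean[(0, 1)]      # treatment minus control, later cohort
    earlier_gap = mean[(1, 0)] - mean[(0, 0)]    # treatment minus control, earlier cohort
    return later_gap - earlier_gap


# Hypothetical standardized scores: (score, treated, later_cohort, weight).
records = [
    (0.10, 1, 0, 1.0), (0.20, 1, 0, 1.0),   # treatment, earlier cohort
    (0.10, 0, 0, 1.0), (0.10, 0, 0, 1.0),   # control, earlier cohort
    (0.30, 1, 1, 1.0), (0.40, 1, 1, 1.0),   # treatment, later cohort
    (0.20, 0, 1, 1.0), (0.20, 0, 1, 1.0),   # control, later cohort
]
estimate = diff_in_diff(records)
print(round(estimate, 2))
```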


Non-Experimental Analysis. Chapter VII presents findings from non-experimental analyses 
that are very similar in structure to the experimental analyses. Those analyses are based on Equations 
(A.3) and (A.6), except that we replace the treatment status indicator with a vector of variables that 
are indices describing the level or intensity of teacher induction services reported by the teacher. 
Alternately, we replace the treatment status indicator with a single index. The result, presented in 
Equation (A.9), is an extension of the retention analysis. The student achievement model (not 
shown) is analogous. 


$$Y_{ij} = \mu + \theta' Q_{ij} + \beta' X_{ij} + \gamma' Z_j + [u_j + \varepsilon_{ij}] \tag{A.10}$$


where we replace $T_j$, the indicator variable for assignment to the treatment group in Equation (A.3), 
with $Q_{ij}$, representing a vector of indices (or a single index) describing the breadth, intensity, or 
nature of induction services. Each coefficient in the $\theta$ vector captures the relationship between an 
induction index and the outcome Y. The same vector of X and Z variables used in the experimental 
section is used in the corresponding nonexperimental analysis unless specified otherwise. The 
psychometric properties of the indices are presented in Table A.5. 

Note (difference-in-differences analysis): Due to small sample size, we do not present estimates of a 
model in which the sample is composed of teachers common to all three cohorts. 

B. Analysis Weights 

Most analyses in the report use weights that accounted for two aspects of the study design. One 
is nonresponse to the surveys, and the other is the unequal probability across districts of a teacher 
being in the treatment group. This appendix explains the nature of these problems and how weights 
were used to address them. 


The response rates for this study’s surveys exceeded the targets set in the study design, but we 
did observe statistically significant differences in response rates between treatment and control 
groups. A concern with differential response rates is that, if nonresponse is not random with respect 
to outcomes, the degree to which nonresponse affects the average outcomes will differ by treatment 
status, and the impact estimates (which are differences in mean outcomes for respondents only) will 
be biased. If, for example, nonrespondents have worse outcomes than respondents, then we would 
expect the lower response rates for the control group to translate into an upwardly biased estimate 
of the counterfactual outcome and therefore a downwardly biased estimate of the impact. 


To mitigate the potential bias, we constructed nonresponse adjustment weights, calculated 
separately for each data collection instrument as follows. First, we used a logistic regression model to 
estimate the relationship between the likelihood of responding to the survey and the baseline 
variables, such as the teacher’s age, level of education, and preparation route. We estimated separate 
prediction models for the treatment and control groups. Then we computed the weight as the 
inverse of the predicted probability of responding. This procedure is equivalent to letting the 
respondents in each treatment group who look most like nonrespondents carry a greater weight so 
that they can stand in for their missing counterparts. We used these weights in all impact estimations 
with teacher outcomes, although the weights did not substantially change the findings. 


We made one adjustment to the weights to deal with potential confounding of district 
characteristics with treatment status. As with most multisite studies, the probability of assignment to 
treatment was not identical across districts. Therefore, we tailored the random assignment procedure 
slightly to each district based on (1) the number of schools that the district contributed to the study 
and (2) the cluster size (number of eligible teachers per school), resulting in some variation in the 
ratio of treatment to control teachers. Thus, when we report averages based on data pooled across 
districts, we must use weights to account for differential treatment-control ratios; otherwise, the 
treatment-control comparisons for the full study would confound treatment differences with site 
differences. For example, if we had assigned 60 percent of the teachers to the treatment group in an 
extremely low-income district and 50 percent of the teachers to the treatment group in all other 
districts, the low-income students would be overrepresented in the overall treatment group, even 
though random assignment produced equivalent groups within each district. To correct for such 
overrepresentation, we divided the weights described above by the number of observations in each 
treatment group within each site and multiplied by the average number of observations in the two 
treatment groups in the district. The result is Equation (A.10): 


$$\mathrm{WEIGHT}_{ikm} = \frac{1}{p_i} \times \frac{(n_{k1} + n_{k2})/2}{n_{km}}$$

where i indexes teachers, k indexes districts, and m indexes experimental group (treatment or 
control). The term $p_i$ represents the predicted probability of teacher i being a respondent, and 
$n_{km}$ is the number of observations in experimental group m within district k. 
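
The weight construction described above can be sketched as follows. This is an illustrative 
calculation only; the function name `analysis_weight` and the example counts and response probability 
are hypothetical. 

```python
def analysis_weight(p_respond, n_own_group, n_other_group):
    """Nonresponse weight rescaled for the district's treatment-control ratio.

    p_respond: predicted probability that the teacher responds.
    n_own_group: observations in the teacher's experimental group in the district.
    n_other_group: observations in the other experimental group in the district.
    """
    if not 0 < p_respond <= 1:
        raise ValueError("p_respond must be in (0, 1]")
    if n_own_group <= 0 or n_other_group <= 0:
        raise ValueError("both groups need at least one observation")
    average_group_size = (n_own_group + n_other_group) / 2
    return (1 / p_respond) * average_group_size / n_own_group


# Hypothetical district with 60 treatment and 40 control observations,
# and the same predicted response probability for every teacher.
w_treatment = analysis_weight(0.8, 60, 40)
w_control = analysis_weight(0.8, 40, 60)
print(60 * w_treatment, 40 * w_control)
```

With equal response probabilities, the two groups carry the same total weight within the district, 
which is the property that keeps site differences from being confounded with treatment differences. 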

We developed enhanced weights for use with follow-up surveys to take advantage of the 
detailed list of background variables available from the background (baseline) survey. The enhanced 
weights made no difference in the estimates; therefore, we did not use them in the benchmark 
analyses presented in this report. 

C. Outcome Variables 

1. Classroom Practices 

Classroom observers were trained to use the Diagnostic Classroom Observation (DCO) 
protocol to assess instruction practices. The DCO, originally called the Vermont Classroom 
Observation Tool (VCOT), is a proprietary tool for classroom observations developed by the 
Vermont Institutes (see Saginor and Hyjek 2005; Saginor 2008). Researchers who first worked with 
Science and Math Program Improvement (SAMPI), a research group at Western Michigan 
University, developed the tool over several years. SAMPI had developed an instrument to measure 
the quality of standards-based, investigative science and mathematics instruction based on research 
conducted by Horizon Research, Inc. 

In developing the DCO, the Vermont Institutes staff used the SAMPI Observation Tool as a 
starting point and carefully reviewed Charlotte Danielson’s Framework for Teaching (1996), on 
which the widely used Praxis III observational assessment (and ETS induction program) is based 
(Dwyer 1994). In parallel with the Praxis III content, the DCO developers included examples of 
evidence for each indicator, added systematic and ongoing formative and summative assessment of 
student learning as a major indicator, and simplified and shortened the tool. The tool underwent 
further refinement through its use in the field by a group of trained teacher-leaders who observed 
classrooms. In 2004, several of those involved in the original design of the tool adapted it for use in 
the observation of literacy lessons. The standards and practices included in the National Council of 
Teachers of English (NCTE) Standards and the National Reading Panel (NICHHD 2000) also 
helped inform development of the literacy version of the DCO. 

The DCO describes teaching practices in four areas: 

1. Planning and Organization of a Lesson 

2. Implementation of a Lesson 

3. Content of a Lesson 

4. Classroom Culture 

In this study, we attempted to measure all but the first construct, lesson planning and 
organization. The procedure for assessing lesson planning and organization is more suited for 
individual teacher feedback than for research and requires measurement of activities before the start 
of a lesson and a separate teacher interview of varying length and content. 

Implementing the DCO. Staff from the Vermont Institutes trained the classroom observers. 
Much of the training relied on videotaped classes, but it also included practice observations 
conducted in pairs in “live” school settings. During the practice observations, observers scored 
independently and then debriefed to reach consensus on any individual items for which the 
discrepancy exceeded a single point. In addition to practice observations, observers participated in 
training for a total of nine days over the course of three training sessions. 

After observing and scoring a videotaped class, observers were deemed “certified” to conduct 
the observations based on a comparison of their 16-item scores to the observations of a “gold 
standard” panel. The gold standard panel consisted of the tool’s developer and two trained 
observers who demonstrated a clear understanding of the items measured in the tool and showed 
high rates of agreement in scoring. Trainees had two opportunities to come within 0.75 points of the 
gold standard average score for the three constructs (implementation, content, and culture) during a 
test observation. Trainees who did not meet the standard were not allowed to conduct observations. 
To address the possibility that observers’ scoring would start to drift in one direction or another 
after conducting observations, we asked the tool developer to observe a classroom with each 
observer in the field at least once to verify scoring after each observer had completed several 
observations. As mentioned in Chapter III, observers were always blind to teachers’ treatment status 
and therefore did not know if they were observing someone who had received the comprehensive 
induction support. 

Interpreting DCO Scores. To summarize the information from the classroom observations 
across all 16 indicators, we produced three scores corresponding to the three domains captured by 
the observation protocol (into which the items had already been grouped). We performed a factor 
analysis of the 16 classroom observation items to explore the degree to which the theoretical 
groupings were empirically justified. In finding the groupings justified, we maintained the 
three-construct scoring method (implementation of literacy lesson, content of literacy lesson, and 
classroom culture) described above. Factor loadings for the 16 class observation items are shown in 
Table A.2. Psychometric details for each construct are presented in Table A.5. 

Table A.2. DCO Classroom Practices Constructs: Factor Loadings 

                                                   Factor Loading 
Variable                                            1         2 

Literacy Implementation 
  Best practices                                  0.808     0.364 
  Institutional choices                           0.719     0.509 
  Student choices                                 0.805     0.241 
  Pace                                            0.595     0.581 

Literacy Content 
  Understanding content and close reading         0.756     0.321 
  Assessment                                      0.473     0.275 
  Skill development                               0.784     0.332 
  Connections between reading and writing         0.771     0.138 

Literacy Classroom Culture 
  Maximizes learning opportunities                0.315     0.868 
  Routines clear and consistent                   0.256     0.817 
  Respectful behavior, safe atmosphere            0.278     0.867 
  Literacy valued                                 0.644     0.439 
  Teacher works collaboratively with students     0.536     0.652 
  Students work collaboratively with students     0.458     0.654 
  Equal access to teacher and resources           0.285     0.776 

Source: Mathematica classroom observations conducted in spring 2006. 

Note: Data pertain to teachers in all districts participating in the study. The extraction method was 
principal components analysis, and the rotation method was varimax with Kaiser normalization. 

N = 631 teachers. 

The estimated impacts on classroom practices described in Chapter V can be better understood 
by relating the DCO scores to student test scores. We conducted correlational analyses to explore 
whether there is a relationship between student achievement gains and DCO scores. After fielding 
the DCO for this study and comparing the results with student achievement gains, we found that 
the association between each of the three classroom practices indices and student test scores was 
positive and statistically significant (regression coefficients = 0.065 to 0.085, p-values = 0.001 to 
0.024). Because the test scores have been standardized to have a mean of zero and a standard 
deviation of one, the magnitude of each estimate can be interpreted as an effect size. That is, a 
one-unit change in a DCO score, for example from “limited evidence” to “moderate evidence,” is 
associated with an increase in reading test scores of between 6.5 percent and 8.5 percent of a 
standard deviation. 

2. Teacher Attitude Measures 

Using items from the induction activities surveys, we measured teachers’ feelings of satisfaction 
in 19 areas (such as satisfaction with their workload) and preparedness in 13 areas (such as 
preparedness to work with students with special challenges). The surveys asked teachers to respond 
along a four-point scale (ranging from “very dissatisfied” to “very satisfied” and from “not at all 
prepared” to “very well prepared”). For both satisfaction and preparedness, we conducted a factor 
analysis on fall 2005 data to explore how items could be sensibly grouped together. The factor 
analyses suggested that teacher satisfaction consisted of satisfaction with (1) school, (2) class, and 
(3) career, and teacher preparedness consisted of preparedness to (1) instruct, (2) work with 
students, and (3) work with others. We used these domains to summarize the data. Factor loadings 
for the teacher satisfaction items are shown in Table A.3 and for teacher preparedness items in 
Table A.4. Psychometric properties for each scale are given in Table A.5. 

Table A.3. Teacher Satisfaction Constructs: Factor Loadings 

                                                                              Factor Loading 
Variable                                                                    1        2        3 

Satisfaction with School 
  Support from administration for beginning teachers                     0.757    0.330    0.043 
  Availability of resources and materials/equipment for your classroom   0.576    0.264    0.153 
  Input into school policies and practices                               0.665    0.296    0.202 
  Opportunities for professional development                             0.473    0.250    0.338 
  Principals' leadership and vision                                      0.765    0.281    0.015 
  Professional caliber of colleagues                                     0.709    0.046    0.251 
  Supportive atmosphere among faculty/collaboration with colleagues      0.728    0.075    0.191 
  School facilities such as the building or grounds                      0.557    0.215    0.141 
  School policies                                                        0.631    0.449    0.183 

Satisfaction with Class 
  Autonomy or control over own classroom                                 0.397    0.551    0.038 
  Student motivation to learn                                            0.194    0.736    0.194 
  Student discipline and behavior                                        0.167    0.795    0.177 
  Parental involvement in the school                                     0.210    0.498    0.336 
  Grade assignment                                                       0.239    0.558   -0.021 
  Students assigned                                                      0.156    0.734    0.143 

Satisfaction with Teaching Career 
  Salary and benefits                                                    0.035    0.008    0.851 
  Professional prestige                                                  0.425    0.271    0.623 
  Intellectual challenge                                                 0.414    0.346    0.460 
  Workload                                                               0.313    0.386    0.475 

Source: Mathematica First Induction Activities Survey administered in fall 2005 to all study teachers. 

Note: Data pertain to teachers in all districts participating in the study. Emphasis on standardized 
test scores was not included in factor analyses or subscales. The extraction method was principal 
components analysis, and the rotation method was varimax with Kaiser normalization. 

N = 889 teachers. 


3. Test Score Data 


Aggregation of Test Scores Across Grades, Subjects, and Districts. Districts and even 
grades within some districts varied with respect to types of tests administered. Aggregating test 
scores across different tests posed a challenge for the analysis because it is important that 
achievement be measured in a common metric in order to combine the results across districts and 
grades. In anticipation of this problem, we designed the random assignment of schools to yield an 
approximately even mix of teachers in the treatment and control groups by grade level within 
district. Therefore, treatment-control comparisons within any grade level and district became 
“apples-to-apples” comparisons. Instead of trying to compare different increments of learning 
across grades and districts, this approach only requires combining treatment-control differences 
(impact estimates) from all district-grade combinations to a single number in order to summarize the 
findings and draw on as large a sample as possible. 

Table A.4. Teacher Preparedness Constructs: Factor Loadings 

                                                                        Factor Loading 
Variable                                                              1        2        3 

Prepared to Instruct 
  Managing classroom activities, transitions, and routines         0.677    0.397    0.045 
  Using a variety of instructional methods                         0.747    0.182    0.225 
  Assessing your students                                          0.621    0.211    0.399 
  Selecting and adapting curriculum and instructional materials    0.690    0.154    0.345 
  Planning effective lessons                                       0.644    0.148    0.497 
  Being an effective teacher                                       0.693    0.340    0.298 
  Addressing the needs of a diversity of learners                  0.621    0.337    0.292 

Prepared to Work with Students 
  Handling a range of classroom behavior or discipline situations  0.573    0.599    0.001 
  Motivating students                                              0.448    0.604    0.133 
  Working effectively with parents                                 0.077    0.725    0.447 
  Working with students who have special behavioral, emotional, 
    developmental, or physical challenges                          0.264    0.691    0.226 

Prepared to Work with Other School Staff 
  Working with other teachers to plan instruction                  0.268    0.166    0.809 
  Working with the principal or other instructional leaders        0.282    0.287    0.779 

Source: Mathematica First Induction Activities Survey administered in fall 2005 to all study teachers. 

Note: Data pertain to teachers in all districts participating in the study. The following items were 
not included in factor analyses or subscales: teaching reading/language arts, teaching mathematics, 
and working with English-language learners. The extraction method was principal components 
analysis, and the rotation method was varimax with Kaiser normalization. 

N = 895 teachers. 

To facilitate aggregation by grade and district, we converted all test scores to a common metric
called a z-score, which is obtained by subtracting the mean from each value and dividing by the
standard deviation of a “universe” of test-takers, approximated by a state or national norm group.
The resulting score expresses the distance from the reference group's average in standard deviation
units; a z-score of -0.5, for example, means that the score was one-half of a standard deviation
below the state or national mean. We used the mean and standard deviation of the norm sample for
each test as published by state agencies or test developers.
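
The conversion described above can be sketched in a few lines of Python. This is only an illustration: the function name and the norm-group values (mean 400, standard deviation 40) are hypothetical and are not taken from any test used in the study.

```python
def to_z_score(raw_score, norm_mean, norm_sd):
    """Express a raw test score in standard deviation units of a norm group."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    # Subtract the norm-group mean, then divide by the norm-group standard deviation.
    return (raw_score - norm_mean) / norm_sd


# Hypothetical norm group (assumed values, for illustration only).
z = to_z_score(380, norm_mean=400, norm_sd=40)
print(z)  # one-half of a standard deviation below the norm-group mean
```

Because every score is rescaled against its own test's norm group, scores from different grades and districts end up on a common scale.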

As an example, consider the hypothetical case where we compare the gains for a fourth-grade
teacher named Ms. Smith in Seattle with those of a fifth-grade teacher named Mr. Cone in
Cleveland. (Seattle and Cleveland are listed as hypothetical examples; they are not in the study.)
Assume that Ms. Smith’s students scored at the average level for Seattle third graders in
the pretest year and 10 percent of a standard deviation above the fourth-grade average at the end of
the posttest year on a Washington State math assessment. Also assume that students in Mr. Cone’s
class in Cleveland who performed at one-half of a standard deviation above the mean at the end of
grade four on Ohio’s state math assessment subsequently scored 0.6 of a standard deviation above the
mean at the end of grade five. These would be considered equivalent, as both sets of students moved up
one-tenth of a standard deviation relative to their local reference groups on their own state’s assessment
(0.1 - 0.0 = 0.6 - 0.5).
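
The arithmetic in this hypothetical comparison can be checked with a short sketch. The function name is an assumption made for illustration; the four z-scores are the ones given in the example.

```python
import math


def z_gain(pretest_z, posttest_z):
    """Gain in standard deviation units relative to the local reference group."""
    return posttest_z - pretest_z


smith_gain = z_gain(pretest_z=0.0, posttest_z=0.1)  # Ms. Smith's class
cone_gain = z_gain(pretest_z=0.5, posttest_z=0.6)   # Mr. Cone's class

# Compare with a tolerance, since 0.6 - 0.5 is not exactly 0.1 in floating point.
equivalent = math.isclose(smith_gain, cone_gain, abs_tol=1e-9)
print(equivalent)
```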

It is also possible to aggregate by subject matter. We kept two broad subject areas distinct,
math and reading (which includes English/language arts), and we present the findings separately
for those two subjects. We dropped reading z-scores from two districts because the tests were
scored within the district. We excluded other subjects from the main impact analysis, such as foreign
languages, social studies, or science, which are not available in enough districts to yield meaningful
findings. Psychometric properties of the test score measures are given in Table A.5.

Missing Data. Not every student that a teacher was responsible for during the year had a valid,
usable test score for the analysis. For example, students might have been exempt from testing, be
missing a test score because of repeated absence, not have been enrolled during the test period, or
have repeated the grade in which they were enrolled in 2007-2008. In districts for which these data
were available, we did not use pretest scores for students whose grade in the 2006-2007 school year
matched their grade in the 2007-2008 school year. These students were presumed to be grade
repeaters, meaning that the pretest score was given on a different scale than that of the main group
of students.

In addition, we excluded students from a subject if their test scores were out of range for the 
test for that district, grade, and subject, as this was evidence that the test score provided was taken 
from an alternative assessment. Of the students with a valid reading posttest in district-grade 
combinations included in the benchmark model, 88 percent of treatment students had a pretest 
score compared to 85 percent of control students, a difference of almost 4 percentage points (after 
accounting for rounding). For students with a valid math posttest in district-grade combinations 
included in the benchmark model, 86 percent of treatment students had a pretest score compared to 
84 percent of control students, a difference of 2 percentage points. We assumed that the data were 
missing at random. Finally, we dropped teachers whose average student gain scores were greater 
than 1.5 standard deviations above or below the mean for the reference group (state or norm 
sample). This resulted in the loss of one classroom, whose students had gains in math scores that 
would have placed most of them in the 94th percentile or above for the state, even though the same 
students' gains in reading were slightly below average. 
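
The final exclusion rule in the paragraph above (dropping teachers whose average student gain is more than 1.5 standard deviations from the reference-group mean) can be sketched as follows. This is a minimal illustration that assumes gains are already expressed in z-score units; the teacher labels and gain values are invented, and the study's actual processing code is not described in the text.

```python
from statistics import mean


def keep_teacher(student_gains, threshold=1.5):
    """Keep a teacher unless the average gain exceeds the threshold in either direction."""
    return abs(mean(student_gains)) <= threshold


# Invented classrooms with student gains in standard deviation units.
classrooms = {
    "teacher_a": [0.1, -0.2, 0.3],
    "teacher_b": [1.8, 1.6, 1.7],  # average gain of about 1.7, beyond the 1.5 cutoff
}

kept = {name: gains for name, gains in classrooms.items() if keep_teacher(gains)}
print(sorted(kept))
```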

Restrictions. Based on the data provided by school districts, we excluded some students from
the model if it appeared implausible, based on objective criteria, that the teacher linked to them was
their full-time teacher for one or both subjects. Our first step was to check the second Teacher
Mobility Survey, conducted in fall 2007, to ensure that all teachers in the test score data claimed to
teach reading, math, or special education/resource in response to a survey question about teaching
assignment. Furthermore, we checked this survey to ensure that there were no mismatches between
the grade of the majority of students linked to the teacher and the grade(s) the teacher claimed to
have taught on this survey. In this way, we excluded two teachers, both of whom were linked to
more than 100 students because they taught physical education/health.

Even after excluding these teachers, the test score samples included some teachers who were
linked to a number of students that was implausibly high or low for a regular classroom teacher. Two
districts provided the subjects taught by teachers with student test score data. To help determine
teaching assignments in other districts, we followed up with teachers whose responses on the second
Teacher Mobility Survey were outliers, especially teachers who were linked to 30 or more students or
to 12 or fewer students. We asked these teachers to verify their teaching assignments by giving the
number of students for whom they were principally responsible for math and reading outcomes in
the 2006-2007 school year and the 2007-2008 school year. We compared their responses to the data
we had been given by the district and, in some cases, clarified discrepancies by querying the school
district. Of those identified, we successfully contacted 24 of 46 treatment teachers (52 percent) and
18 of 43 control teachers (42 percent).

Based in part on their responses, we used these criteria to restrict the sample for the test score 
analysis: 


• If a teacher was linked to 30 or more students and indicated on the Teacher Background
Survey that she was not responsible for reading or math outcomes, students were
excluded from whatever subject the teacher said she did not teach. This affected
5 teachers and 223 students in reading and 3 teachers and 144 students in math.

• We excluded all students from a teacher who was linked to 40 or more students and
responded on the Teacher Background Survey that she was responsible for both reading
and math outcomes. This affected 4 teachers, 329 students in reading, and 330 students
in math.

Because treatment teachers were more likely to confirm their teaching assignments than
control teachers, we applied the four restriction rules to all teachers, even if we learned
during the follow-up interview that they were an exception to these general rules. In this
way, the application of the restrictions was not confounded with treatment status. As part
of the sensitivity analysis, however, we also estimated the model (1) based on a sample that
used general rules for teachers whom we were unable to contact and particular information
from teachers’ responses if it contradicted the general rules; and (2) based on a sample that
did not impose the two rules.
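
The two bulleted restriction rules can be expressed as a small sketch. The class and field names below are hypothetical, and the sketch covers only the two rules listed in the bullets; it is not the study's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class TeacherLink:
    """Hypothetical record: students linked to a teacher plus her survey responses."""
    n_students: int
    teaches_reading: bool
    teaches_math: bool


def included_subjects(teacher):
    """Return the subjects for which the teacher's linked students stay in the sample."""
    # Second rule: 40 or more students and responsible for both subjects -> exclude all.
    if teacher.n_students >= 40 and teacher.teaches_reading and teacher.teaches_math:
        return set()
    subjects = {"reading", "math"}
    # First rule: 30 or more students -> drop any subject the teacher said she did not teach.
    if teacher.n_students >= 30:
        if not teacher.teaches_reading:
            subjects.discard("reading")
        if not teacher.teaches_math:
            subjects.discard("math")
    return subjects


print(included_subjects(TeacherLink(32, teaches_reading=True, teaches_math=False)))
```

Applying the function uniformly to every teacher, regardless of what was learned in follow-up interviews, mirrors the point made above that the restrictions should not depend on treatment status.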

The test score analysis pertains only to teachers in tested grades and subjects. Because students
were included in the benchmark model only if they had a valid pretest from the prior year, we
excluded the youngest grade at which testing began in a district. For example, in districts that test
students in grades three through eight, as mandated by the No Child Left Behind Act, and operate
elementary schools that enroll students in kindergarten through grade five (the most common case),
we were able to estimate impacts on achievement for grades four and five only. As part of the
sensitivity analyses, we excluded the pretest covariate from the analysis and thus were able to
consider more grades and include more students in the analysis.

Table A.5. Psychometric Properties of Measures


Number 



Minimum 

Maximum 

Sample 

Cronbach's 

Outcome of Items Mean 

Median 

SO 

Value 

Value 

Siie 

Alpha 

Teacher Satisfaction 

Satisfacton with career 








Fall 2005 * 

l 3.01 

3.00 

0.60 

1.00 

4.00 

889 

0.77 

Sonng 2006 * 

1 2.91 

3.00 

0.63 

1.00 

4.00 

876 

0.78 

Fall 2006 < 

1 2.96 

3.00 

0.63 

1.00 

4.00 

831 

0.73 

Sonng 2007 * 

1 287 

3.00 

0.66 

1.00 

4.CO 

370 

0.78 

Fall 2007 * 

1 288 

3.00 

0.63 

1.00 

4.00 

752 

0.72 

Fal 2006 * 

1 287 

3.00 

0.71 

1.00 

4.00 

715 

0.90 

Satisfaetkm with class 








Fall 2005 ( 

1 3.05 

3.17 

0.61 

1.00 

4. CO 

889 

084 

Sonng 2006 » 

1 2.99 

3.00 

0.64 

1.00 

4.00 

876 

085 

Fall 2006 t 

* 3.14 

3.17 

0 58 

1.17 

4.CO 

832 

0.78 

Sonng 2007 ( 

i 3.09 

3.17 

0.59 

1.00 

4.00 

370 

082 

Fall 2007 * 

I 3.11 

3.17 

0.59 

1.00 

4.00 

749 

0.77 

Fal 2006 f 

1 3.15 

3.17 

0 59 

1.00 

4. CO 

715 

0.93 

Satisfaction with school 








Fall 2005 < 

1 3.10 

3.11 

0.63 

1.00 

4. CO 

889 

0.90 

Sonng 2006 ‘ 

1 2.99 

3.00 

066 

1.00 

4.CO 

876 

0.91 

Fall 2006 ‘ 

i 3.14 

3.22 

0.59 

1.11 

4. CO 

832 

088 

Sonng 2007 ‘ 

i 2.99 

3.00 

0.64 

1.00 

4.00 

370 

089 

Fall 2007 ‘ 

1 3.10 

3.11 

0.59 

1.11 

4.00 

745 

087 

Fal 2008 < 

1 3.11 

3.22 

0.62 

1.11 

4. CO 

716 

096 

Teacher Preparedness 

Preparedness Id nstrucl 








Fall 2005 

’ 280 

2.86 

056 

1.00 

4. CO 

895 

0.90 

Sonng 2006 

r 2.96 

3.00 

056 

1.00 

4.00 

876 

0.92 

Sonng 2007 1 

' 3.14 

3.00 

054 

1.00 

4.00 

371 

0.90 

Fal 2008 

r 346 

3.57 

048 

1.71 

4.00 

694 

098 

Prepared re ss Id week »ilh ethers 








Fall 2005 i 

f 288 

3.00 

0.74 

1.00 

4. CO 

695 

082 

Sonng 20C6 i 

f 2.96 

3.00 

0.71 

1.00 

4.00 

874 

082 

Sonng 2007 i 

f 3.12 

3.00 

068 

1.00 

4.CO 

371 

0.73 

Fal 2008 : 

1 381 

3.50 

0.62 

1.00 

4.00 

694 

0.94 

Preparedness Id ao* with students 








Fall 2005 < 

1 2.73 

2.75 

0.59 

1.00 

4.00 

895 

0.78 

Sonng 2006 ‘ 

l 284 

2.75 

0.61 

1.00 

4.00 

876 

083 

Sonng 2007 * 

1 2.99 

3.00 

0.57 

1.00 

4. CO 

371 

0.75 

Fal 2008 r 

l 325 

3.25 

0.55 

1.25 

4.00 

694 

0.95 

Classroom Practices 

Implementation of literacy lessen 

> 2.68 

2.60 

0.84 

1.00 

480 

631 

089 

Coolenl of literacy lesson ‘ 

1 2.37 

2.25 

0.79 

1.00 

5.00 

631 

080 

Cljssrccm culture 

r 3.06 

3.14 

087 

1.00 

5.00 

631 

0.93 

Student Achievement 

Reading pastiest scores 2008 

•021 

•0.15 

0.93 

•4.40 

3.40 

3.037 

nfa 

Math poshest scores 2008 

•0.13 

•0.12 

1.00 

•4.44 

3.47 

2.827 

nfa 

Induction Suoport 

Fill sample of teachers 

Years BT had an assigned mentor 

t 123 

1.00 

062 

000 

2.01 

901 

nfa 

Induction Breadth Index ‘ 

1 183 

2.00 

0.70 

0.00 

3.00 

969 

nfa 

Fall 2005 C 

1 






028 

Spnng 2006 C 

1 






0.38 

Fal 2006 C 

1 






0.33 
Table A.5 (continued)

                                            Number                       Minimum  Maximum  Sample  Cronbach's
Outcome                                   of Items   Mean  Median    SD    Value    Value    Size       Alpha

Instructional Focus Index                        8   1.55    1.50  0.57     0.00     2.50     965         n/a
  Fall 2005                                      3                                                       0.52
  Spring 2006                                    3                                                       0.54
  Fall 2006                                      2                                                       0.29
Induction Intensity Index                       10   1.44    1.08  1.43     0.00    22.61     965         n/a
  Fall 2005                                      4                                                       0.28
  Spring 2006                                    4                                                       0.28
  Fall 2006                                      2                                                       0.42

Sample of teachers in student math test scores analyses
Years BT had an assigned mentor                  3   1.19    1.00  0.61     0.00     2.01     171         n/a
Induction Breadth Index                          9   1.92    2.00  0.65     0.00     3.00     172         n/a
  Fall 2005                                      3                                                       0.26
  Spring 2006                                    3                                                       0.41
  Fall 2006                                      3                                                       0.35
Instructional Focus Index                        8   1.50    1.60  0.54     0.00     2.50     172         n/a
  Fall 2005                                      3                                                       0.55
  Spring 2006                                    3                                                       0.33
  Fall 2006                                      2                                                       0.25
Induction Intensity Index                       10   1.57    1.17  2.03     0.00    22.61     172         n/a
  Fall 2005                                      4                                                       0.22
  Spring 2006                                    4                                                       0.30
  Fall 2006                                      2                                                       0.46

Sample of teachers in student reading test scores analyses
Years BT had an assigned mentor                  3   1.21    1.00  0.62     0.00     2.01     175         n/a
Induction Breadth Index                          9   1.92    2.00  0.67     0.00     3.00     177         n/a
  Fall 2005                                      3                                                       0.34
  Spring 2006                                    3                                                       0.44
  Fall 2006                                      3                                                       0.33
Instructional Focus Index                        8   1.57    1.50  0.56     0.00     2.50     178         n/a
  Fall 2005                                      3                                                       0.57
  Spring 2006                                    3                                                       0.40
  Fall 2006                                      2                                                       0.27
Induction Intensity Index                       10   1.63    1.23  2.01     0.00    22.61     178         n/a
  Fall 2005                                      4                                                       0.21
  Spring 2006                                    4                                                       0.32
  Fall 2006                                      2                                                       0.46

Source: Mathematica analysis using data from the 2007-2008 school year provided by participating school districts;
Mathematica First, Second, and Third Induction Activities Surveys administered in fall 2005, spring 2006, and fall
2006 to all study teachers; Mathematica classroom observations conducted in spring 2005.

Note: Cronbach's alpha was calculated separately for variables within each time point for the Induction Breadth Index,
Instructional Focus Index, and Induction Intensity Index. Due to missing data, some values for induction support
have been imputed, which can cause maximum values for a particular variable to exceed the value associated with
full-time induction support.

BT = beginning teacher; n/a = not applicable.
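
For readers unfamiliar with the reliability statistic reported in Table A.5, the standard formula for Cronbach's alpha can be sketched as below. The report does not say which software or variance convention was used, so the function and the tiny response matrix are illustrative assumptions only.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Variance of each item across respondents.
    item_variance_sum = sum(pvariance(item) for item in zip(*responses))
    # Variance of each respondent's total score (must be nonzero).
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Invented two-item scale on which the items move together perfectly.
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
print(round(alpha, 2))
```

Using population variances throughout (rather than sample variances) does not change the result, because the same correction factor appears in both the numerator and denominator of the ratio.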
D. Supplementary Table for Chapter III

Chapter III presented information on the survey instruments and other methods by which we
collected data. Table A.6 provides sample size information for Table III.2.


Table A.6. Sample Sizes for Table III.2. Response Rates to Teacher Surveys by Subgroup and Treatment
Status

                                                Number of Teachers
                                                 Treatment   Control

District Type (Years of Implementation)
  One year                                             275       286
  Two year                                             231       217

Grade Level
  K or pre-K                                            80        72
  1                                                     73        71
  2                                                     84        78
  3                                                     81        57
  4                                                     60        60
  5                                                     46        52
  Other/multiple                                        82       113

School Type (Percent in Free Lunch Program)
  Unknown                                               30        29
  0-49.9%                                               37        29
  50-74.9%                                              98       128
  75-100%                                              341       317


Source: Mathematica teacher induction survey management system; Mathematica Teacher Background Survey
(fall 2005), Induction Activities/Teacher Mobility Surveys (fall 2006 and 2007) administered to all study
teachers; Induction Activities Survey (spring 2007) administered to teachers in two-year districts.

Note: The Induction Activities Survey and Mobility Survey were administered together in fall 2006 and fall
2007.

n/a = not applicable.

APPENDIX B

SUPPLEMENTAL INFORMATION FOR CHAPTER IV

This appendix presents estimates of the receipt of induction activities as reported by teachers.
First, tables show treatment and control group means and service contrast estimates using data from
each induction activities survey: fall 2005, spring 2006, fall 2006, spring 2007 (for teachers in
two-year districts), fall 2007, and fall 2008. Estimates for one-year and two-year districts are presented
separately. Tables B.1-B.15 pertain to one-year districts and Tables B.16-B.30 pertain to two-year
districts. The figures of service contrast estimates shown in Chapter IV are based on the estimates
shown in these tables.

Second, figures show treatment-control differences in total minutes spent in mentoring by
district. Estimates are shown for fall 2005 and fall 2006 and for one-year and two-year districts
separately. Figures B.1 and B.2 pertain to one-year districts and Figures B.3 and B.4 pertain to
two-year districts. Within each figure, the districts are ordered according to the size of the difference.

Table B.1. Teacher Reports on Professional Support and Duties (Percentages), Fall 2005 and Fall 2006: One-Year Districts



                                          Fall 2005                                    Fall 2006
                            Treatment  Control  Difference  P-value      Treatment  Control  Difference  P-value

BT has mentor                    93.1     77.5       15.6*    0.000           24.5     37.7      -13.2*    0.003
BT has assigned mentor           89.8     69.9       20.0*    0.000           19.7     29.2       -9.5*    0.017
Sample Size (Teachers)            258      245        503                      241      231        472

Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to
account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to
item nonresponse.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher.

Table B.2. Impacts on Teacher-Reported Mentor Profiles (Percentages), Fall 2005 and Fall 2006: One-Year Districts




                                                                Fall 2005                                    Fall 2006
Mentoring Characteristic                          Treatment  Control  Difference  P-value      Treatment  Control  Difference  P-value

Number of Mentors
  Multiple mentors                                     25.4     14.6       10.8*    0.006            5.9      9.7       -3.8     0.106
Number of Mentors
  None                                                  6.9     22.5      -15.6*    0.000           75.5     62.3       13.2*    0.003
  One                                                  67.7     62.9        4.8     0.333           18.6     28.0       -9.4*    0.021
  Two                                                  20.9      8.4       12.5*    0.000            5.9      9.7       -3.8     0.106
Number of Mentors Assigned
  None                                                 10.1     30.1      -20.0*    0.000           80.3     70.8        9.5*    0.017
  One                                                  71.0     62.6        8.4     0.093           18.3     23.5       -5.2     0.186
  Two                                                  18.9      7.3       11.6*    0.001            1.5      5.8       -4.3*    0.010

Mentor Positions
Positions of All Mentors
  Full-time mentor                                     73.7      7.5       66.3*    0.000            1.5      3.7       -2.2     0.201
  Teacher                                              24.5     63.8      -39.3*    0.000           20.8     30.7       -9.9*    0.014
  School or district administrator or staff            10.5      9.1        1.4     0.575            2.9      4.2       -1.3     0.379
    external to district
  No mentor                                             6.9     22.5      -15.6*    0.000           75.5     62.3       13.2*    0.003

Sample Size (Teachers)                                  258      245        503                      241      231        472

Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to
account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to
item nonresponse.

*Significantly different from zero at the 0.05 level.

Table B.3. Impacts on Teacher-Reported Mentor Services Received in the Most Recent Full Week of Teaching, Fall 2005 and Fall 2006: One-Year
Districts


Fat 2(05 Fal 2CC6 


Eftod Effect 


Mentor Serve c 

Troalmerr. 

Cortlrd 

Dltlcrencc 

Size' 

P- value 

Trcoonenl 

Control 

OMercrec 

Size" 

P-valuc 

•Usual' Moalras wlh Mentors 

Frequency |r«iiTe>or cf meclingsl 

1.3 

12 

0.1 

0.03 

0.730 

0.3 

0.7 

•03- 

023 

0.013 

Average deration (mr«J!os| 

23.2 

9.9 

13.3* 

0.74 

0.00 

2.3 

4.3 

-21* 

■023 

0.014 

Total tire‘ (mnutesl 

564 

33.3 

23.1 • 

0.36 

0.00 

99 

18.4 

•84' 

■020 

0043 

Mtormd f/cccros v-th Mentors 

Toed time (mnuesi 

30.4 

33.4 

-3.0 

•0.08 

0.372 

92 

20.1 

.109* 

0.33 

0001 

Total Usual and Informal Time with Mentors (Mlnutosl 

86.8 

687 

20.0* 

024 

0.007 

19.1 

38.3 

•194' 

030 

0.002 

Meeting Tine nth Mentos in the Fc4h«r.g Postinns I M retest 

FutlUme mentor 

60.3 

4.2 

36.2* 

0.99 

0.00 

06 

2.6 

•20 

0.17 

0.109 

Teacher 

23.0 

59.2 

-36.2* 

■046 

0.00 

16.6 

32.6 

-139* 

026 

0009 

Admmatrator 

4.1 

20 

2.1 

0.13 

0.143 

0.3 

2.3 

-20* 

023 

0028 

Slatt cxlemd todslnc 

1.4 

1.4 

0.0 

0.00 

0.976 

1.1 

0.0 

1.1 

0.13 

0.161 

Mentor Time in the Folowno ActMtlos (Minutes) 

Observing BT teaching 

335 

10.0 

235' 

0.73 

0.00 

2.3 

5.7 

-3-3* 

•022 

0.021 

Mecbng wth BT one -on one 

344 

227 

11.7* 

0.38 

0.00 

6 1 

10.1 

•4 0 

-0.19 

0.038 

Meeting with BT and edver hrst-ycor teachers 

285 

9.2 

194* 

0 

34 

0.00 

2.3 

3.6 

•12 

•0.09 

0285 

MccUng wth BT and ether teachers 

18.8 

184 

3.3 

0 

09 

0.320 

68 

10.1 

•33 

-0.14 

0.138 

Modeling a lessen 

90 

86 

3.3' 

0.18 

0.032 

21 

4.0 

•15 

•0.12 

0208 

Co-toaiJiivq a lesson 

58 

4.2 

1.6 

0.09 

0.314 

19 

2.6 

•0.7 

0.04 

0.665 

All su actvties (all mcntcrsl 

130.0 

67.1 

62.9* 

0.38 

0.00 

21.5 

35.8 

-143- 

-0.19 

0.049 

Al so actMUes (study menlor only) 

1106 

00 

110.6' 

1.19 

o.oco 

nla 

nla 

nla 

nla 

n'a 

T spies ol Assistance a Menlor Prodded (Percenlacci 

Suggestions to rroreve pracscc 

77.4 

33.1 

24.4' 

n'a 


0.00 

149 

269 

• 12.1' 

ma 

0.001 

Encouragement or moral supped 

888 

683 

21.3' 

n'a 


0.00 

207 

32.8 

-12.1' 

nla 

0.001 

Opoo ranty to raae tssuesidscuss concerns 

B3.9 

64.7 

21.3' 

n'a 


0.00 

17.7 

31.6 

-139- 

nla 

o.oco 

Help »tth admristratva'ogotiO issues 

67.2 

32.9 

14.3' 

n'a 


0.001 

124 

24.6 

-122* 

nla 

0.001 

Help teaching to meet state or dstnet standards 

61.1 

44.1 

17.0' 

n'a 


o.oco 

109 

19.3 

•84' 

ma 

0.010 

Help identifying teaching challenges and sotutens 

822 

34.8 

274' 

n'a 


o.oco 

139 

23.0 

-91' 

ma 

0.013 

Discus sod nsauclcnat ocas and ways so achcve them 

726 

481 

245* 

n'a 


0.00 

140 

24.4 

.104' 

ma 

0.001 

Guidance on now so assess ssudonts 

58.1 

43.7 

14.4' 

n'a 


o.oco 

109 

21.2 

• 104' 

ma 

0002 

Snared esson plans, assignments or other insouctcnai a: turtles 

55.9 

484 

7.3 

n'a 


0.110 

134 

22.3 

•9.1* 

ma 

0014 

Acted on BT s rejuesf 

71.9 

307 

21.1' 

n'a 


0.00 

120 

20.3 

•86' 

ma 

0.015 

Sample Size ilcachcrsl 

258 

243 

503 




241 

231 

472 




Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher
grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

The product of the mean frequency and mean average duration does not necessarily equal the mean of total time.

Total sample size is 396 in fall 2005 and 141 in fall 2006. The question did not apply to teachers who did not make a request to their mentors.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher; n/a = not applicable.





Table B.4. Impacts on Teacher-Reported Professional Development Activities During Past Three Months, Fall 2005 and Fall 2006: One-Year Districts

                                                                       Fall 2005                                                 Fall 2006
Aspect of Professional Development                Treatment  Control  Difference  Effect Size  P-value      Treatment  Control  Difference  Effect Size  P-value

Activities Completed (Percentages)
  Kept a written log                                   39.9     32.5        7.5           n/a    0.072           27.0     28.5       -1.5           n/a    0.718
  Kept a portfolio and analysis of student             71.6     77.5       -5.9           n/a    0.121           75.2     74.7        0.5           n/a    0.897
    work
  Worked with a study group of new teachers            65.5     34.4       31.0*          n/a    0.000           10.5     20.9      -10.4*          n/a    0.003
  Worked with a study group of new and                 47.8     42.1        5.7           n/a    0.182           37.8     39.8       -1.9           n/a    0.669
    experienced teachers
  Observed others teaching in their classrooms         61.3     44.2       17.1*          n/a    0.000           28.0     26.3        1.7           n/a    0.685
  Observed others teaching your class                  51.1     50.6        0.5           n/a    0.913           26.9     32.1       -5.2           n/a    0.239
  Met with principal to discuss teaching               68.8     70.4       -1.6           n/a    0.693           45.0     51.0       -6.0           n/a    0.232
  Met with literacy or mathematics coach or            77.5     77.1        0.4           n/a    0.900           77.8     75.8        1.9           n/a    0.668
    other curricular specialist
  Met with a resource specialist to discuss            65.5     77.2      -11.7*          n/a    0.005           70.8     77.8       -7.0           n/a    0.067
    needs of particular students

Frequency of Selected Activities (Number of Times During Past Three Months)
  Teaching was observed by mentor                       4.0      1.5        2.5*         0.98    0.000            0.3      0.6       -0.3*        -0.21    0.024
  Teaching was observed by principal                    2.3      2.6       -0.3         -0.13    0.218            1.9      1.8        0.1          0.03    0.758
  Given feedback on your teaching, not as               3.2      2.4        0.8*         0.37    0.000            1.4      1.6       -0.2         -0.11    0.259
    part of formal evaluation
  Given feedback on your teaching, as part              1.7      1.4        0.3          0.17    0.077            0.7      0.7       -0.1         -0.04    0.659
    of formal evaluation
  Given feedback on your lesson plans                   1.6      1.7       -0.1         -0.04    0.683            1.0      1.4       -0.3         -0.17    0.079

Sample Size (Teachers)                                  258      245        503                                   241      231        472

Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for
differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

Effect sizes are reported for continuous measures, but are not indicated for dichotomous variables that are reported as percentages.

*Significantly different from zero at the 0.05 level.

n/a = not applicable.

Table B.5. Impacts on Teacher-Reported Areas of Professional Development During the Past Three Months, Fall 2005 and Fall 2006: One-Year Districts


Attended Professional Development Activities (Percentages)

                                                                              Fall 2005                                    Fall 2006
Professional Development Topic                                  Treatment  Control  Difference  P-value      Treatment  Control  Difference  P-value

Parent and community relations                                       37.3     28.9        8.3     0.052           17.1     17.2        0.0     0.997
School policies on student disciplinary procedures                   46.1     54.4       -8.3     0.052           47.6     47.9       -0.3     0.951
Instructional techniques/strategies                                  77.7     82.0       -4.3     0.297           71.0     68.9        2.1     0.664
Understanding the composition of students in your class              24.9     26.0       -1.1     0.773           21.1     23.5       -2.5     0.546
Content area knowledge (language arts, mathematics, science)         61.1     72.1      -10.9*    0.006           67.5     65.2        2.3     0.617
Lesson planning                                                      30.2     32.1       -1.9     0.641           22.1     24.3       -2.1     0.591
Analyzing student work/assessment                                    44.7     50.1       -5.4     0.239           41.9     44.1       -2.2     0.635
Student motivation/engagement                                        36.2     35.5        0.7     0.876           24.5     24.5       -0.1     0.991
Differentiated instruction                                           52.5     49.0        3.6     0.466           42.0     45.9       -3.9     0.392
Using computers to support instruction                               26.7     34.7       -7.9     0.062           38.7     38.6        0.1     0.984
Classroom management techniques                                      52.7     54.5       -1.8     0.711           23.7     30.2       -6.5     0.105
Preparing students for standardized testing                          30.2     40.9      -10.8*    0.018           29.2     34.9       -5.8     0.177

Sample Size (Teachers)                                                258      245        503                      241      231        472

Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for
differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.



Table B.6. Teacher Reports on Professional Support and Duties (Percentages), Spring 2006: One-Year Districts 





                                           Spring 2006
                              Treatment  Control  Difference  P-value

BT has a mentor                    90.3     80.3       10.0*    0.001
BT has an assigned mentor          89.7     72.2       17.5*    0.000
Sample Size (Teachers)              258      241        499



Source: Mathematica Second Induction Activities Survey administered in spring 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to
account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to
item nonresponse.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher.



Table B.7. Impacts on Teacher-Reported Mentor Profiles (Percentages), Spring 2006: One-Year Districts

                                                                                 Spring 2006
Mentoring Characteristic                                            Treatment  Control  Difference  P-value

Number of Mentors
  Multiple mentors                                                       21.7     11.8       10.0*    0.009
Number of Mentors
  None                                                                    9.7     19.7      -10.0*    0.001
  One                                                                    68.6     68.6        0.0     0.998
  Two                                                                    19.0      9.6        9.4*    0.008
Number of Mentors Assigned
  No mentor assigned                                                     10.3     27.8      -17.5*    0.000
  One mentor assigned                                                    71.3     65.4        5.9     0.209
  Two mentors assigned                                                   18.4      6.8       11.6*    0.001

Mentor Positions
Positions of All Mentors
  Full-time mentor                                                       72.1     10.4       61.8*    0.000
  Teacher                                                                24.5     66.1      -41.6*    0.000
  School or district administrator or staff external to district         10.9      6.3        4.6     0.072
  No mentor                                                               9.7     19.7      -10.0*    0.001

Sample Size (Teachers)                                                    258      241        499

Source: Mathematica Second Induction Activities Survey administered in spring 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to
account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to
item nonresponse.

*Significantly different from zero at the 0.05 level.

Table B.8. Impacts on Teacher-Reported Mentor Services Received in the Most Recent Full Week of Teaching, Spring 2006: One-Year Districts


Spnng 2006 


Mentor Service 

Treatment 

Control 

Difference 

Effect Size' 

P -value 

* Jsua' Meetings with Mentors 






Frequency (number of meetings) 

12 

12 

0.1 

0.03 

0.750 

Average duration (minutes) 

23.4 

110 

12.4* 

068 

0600 

Total lime" fmirulcs) 

56.9 

338 

22.1* 

0.35 

0.000 

Informal Meetings with Mentors 

288 

33.7 

■49 

•0.12 

0.197 

Total Into (minutes) 






Total Usual and Informal Time with Mentors (Minutes) 

84.7 

67.5 

172* 

020 

0.039 

Meeting Time wJi Mentors in tie Folowing Postons (Minutes) 






Fulltime mentor 

52.4 

6.0 

46 4" 

068 

OOCO 

Teacher 

25.9 

59.0 

•33.1* 

•0.41 

0600 

Administrator 

3.9 

25 

1.4 

0.08 

0.427 

Stall external lo dotncl 

2.3 

02 

2.1* 

0.16 

0.046 

Mentor Time in the FolOwing Activities (Minutes) 






Observing BT teaching 

26.8 

7.4 

19.4“ 

065 

OOOO 

Meeting with BT one-on-one 

31.4 

208 

10.6* 

0.33 

0.000 

Meeting with BT and other test-year teachers 

238 

6.0 

17.8“ 

0.51 

0.000 

Meeting wth BT and other teachers 

12 2 

13.7 

•1.6 

•0.05 

0 548 

Modeling a lesson 

8.4 

5.4 

30 

0.16 

0.077 

Co -teaching a lesson 

5.1 

56 

•0.4 

•0.02 

0820 

All six actnilies |al mentors) 

1078 

588 

489“ 

0.49 

0.000 

All six activities (study mentor only) 

96.0 

0.0 

95.0“ 

1.15 

0.000 

Types of Assistance a Mentor Provided (Percentage) 






Suggestions to improve practice 

662 

52.0 

142* 

n'a 

0.001 

Encouragement cr moral sippoit 

77.7 

678 

9.9“ 

n'a 

0010 

Opportunity to raise issues.'discuss concerns 

768 

652 

11.6* 

n'a 

0.003 

Help with acministraliveilogislical issues 

60.4 

50 7 

9.7“ 

n'a 

0.022 

Help wilh leaching to meet state or distnet standards 

528 

41.6 

11-3* 

n'a 

0.007 

Help idenbfying teaching challenges and solutions 

63.6 

52.4 

112“ 

n'a 

0007 

Discussed instructional goals and ways Vo achieve Ihem 

61.3 

40.9 

20.4“ 

n'a 

0.000 

Guidance on how to assess students 

53.3 

370 

16.3“ 

n'a 

0.00*3 

Shared lesson plans, assignments, or other instrumental activities 

55.7 

469 

8.8“ 

n'a 

0049 

Acted cn BTs regucsf 

612 

463 

148“ 

n'a 

0.002 

Sample Size (Teachers) 

258 

241 

499 




Source: Mathematics Second Induction Activities Survey administered in spring 2006 to all study teachers. 

Note: Data pertain to teachers in ore-year districts participatng in the study. Data are weighted and regression adjusted using ordnary least squares to accotnt ter differences n districts, 

teacher grade assignments, stud)' desgn. and the clustering of teachers within schools. Sample sices vary due to item nonresponse 

“Effect sices are reported for continuous measures tut are not indicated for dichotomous variables that are reported as percentages 

'The product of the mean frequency and mean average duration does not necessanly equal the mean of total lime. 

‘Total sample sice is 393. The guesSon did rot apply to teachers who did not make a request to Iheir mentors. 

“Significantly different from cero al the 0.06 level. 

BT - begnning teacher, m'a ■ not applicable. 
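
The table note on total meeting time (the product of mean frequency and mean average duration need not equal the mean of total time) can be checked with a toy calculation. The sketch below uses made-up per-teacher values, not study data, and all variable names are hypothetical.

```python
from statistics import mean

# Hypothetical weekly data for four teachers (not from the study):
# number of meetings and average minutes per meeting.
meetings = [0, 1, 2, 3]
avg_minutes = [0.0, 10.0, 20.0, 30.0]

# Total time is computed teacher by teacher, then averaged.
total_minutes = [n * m for n, m in zip(meetings, avg_minutes)]

mean_total = mean(total_minutes)                       # mean of the products
product_of_means = mean(meetings) * mean(avg_minutes)  # product of the means

print(mean_total, product_of_means)
```

The two quantities differ whenever frequency and duration are correlated across teachers, as they are in this toy data, where teachers who meet more often also meet for longer.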

Table B.9. Impacts on Teacher-Reported Professional Development Activities During Past Three Months, Spring 2006: One-Year Districts

Aspect of Professional Development | Treatment | Control | Difference | Effect Size^a | P-value
Activities Completed (Percentages)
Kept a written log | 38.1 | 29.4 | 8.7* | n/a | 0.036
Kept a portfolio and analysis of student work | 77.3 | 72.7 | 4.6 | n/a | 0.250
Worked with a study group of new teachers | 71.1 | 29.1 | 42.0* | n/a | 0.000
Worked with a study group of new and experienced teachers | 45.1 | 40.2 | 4.9 | n/a | 0.286
Observed others teaching in their classrooms | 67.5 | 38.7 | 28.8* | n/a | 0.000
Observed others teaching your class | 44.7 | 39.3 | 5.4 | n/a | 0.264
Met with principal to discuss teaching | 63.7 | 68.8 | -5.1 | n/a | 0.288
Met with a literacy or mathematics coach or other curricular specialist | 69.9 | 68.4 | 1.5 | n/a | 0.737
Met with a resource specialist to discuss needs of particular students | 57.6 | 65.3 | -7.7 | n/a | 0.085
Frequency of Selected Activities (Number of Times During Past Three Months)
Teaching was observed by mentor | 3.5 | 1.5 | 2.0* | 0.83 | 0.000
Teaching was observed by principal | 1.9 | 2.1 | -0.2 | -0.09 | 0.377
Given feedback on your teaching, not as part of formal evaluation | 2.5 | 1.9 | 0.6* | 0.30 | 0.001
Given feedback on your teaching, as part of formal evaluation | 1.6 | 1.4 | 0.2 | 0.13 | 0.153
Given feedback on your lesson plans | 1.3 | 1.5 | -0.2 | -0.13 | 0.187
Sample Size (Teachers) | 258 | 241 | 499 | |

Source: Mathematica Second Induction Activities Survey administered in spring 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

a Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

*Significantly different from zero at the 0.05 level.

n/a = not applicable.
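
The effect sizes in these tables can be related back to the raw differences with a short calculation. The sketch below rests on an assumption that this section does not state: that an effect size is the difference divided by a standard deviation of the outcome. Under that assumption, the implied standard deviation is the difference divided by the effect size. The function name `implied_sd` is hypothetical, and the inputs are the rounded Table B.9 values for "Teaching was observed by mentor" (difference 2.0, effect size 0.83), so the result is approximate.

```python
def implied_sd(difference: float, effect_size: float) -> float:
    """Back out the standard deviation implied by a reported difference
    and effect size, assuming effect_size = difference / sd."""
    if effect_size == 0:
        raise ValueError("effect size of zero does not identify a standard deviation")
    return difference / effect_size


# Table B.9, "Teaching was observed by mentor": difference 2.0, effect size 0.83.
sd_mentor_observations = implied_sd(2.0, 0.83)
print(round(sd_mentor_observations, 2))
```

Because the published figures are rounded to one or two decimals, the implied value should be read as a rough magnitude rather than the study's actual standard deviation.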

Table B.10. Impacts on Teacher-Reported Areas of Professional Development During the Past Three Months, Spring 2006: One-Year Districts

Attended Professional Development Activities (Percentages), Spring 2006

Professional Development Topic | Treatment | Control | Difference | P-value
Parent and community relations | 24.7 | 22.9 | 1.8 | 0.635
School policies on student disciplinary procedures | 32.1 | 44.9 | -12.9* | 0.006
Instructional techniques/strategies | 70.1 | 73.5 | -3.4 | 0.380
Understanding the composition of students in your class | 20.6 | 21.3 | -0.6 | 0.861
Content area knowledge (language arts, mathematics, science) | 59.2 | 67.7 | -8.5 | 0.051
Lesson planning | 33.0 | 21.6 | 11.4* | 0.005
Analyzing student work/assessment | 52.4 | 42.7 | 9.7* | 0.041
Student motivation/engagement | 30.1 | 29.0 | 1.2 | 0.783
Differentiated instruction | 49.1 | 44.3 | 4.8 | 0.283
Using computers to support instruction | 24.3 | 32.0 | -7.7 | 0.082
Classroom management techniques | 33.2 | 39.9 | -6.7 | 0.162
Preparing students for standardized testing | 48.2 | 52.7 | -4.5 | 0.218
Sample Size (Teachers) | 258 | 241 | 499 |

Source: Mathematica Second Induction Activities Survey administered in spring 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.




Table B.11. Teacher Reports on Professional Support and Duties (Percentages), Fall 2007 and Fall 2008: One-Year Districts

Measure | Fall 2007 Treatment | Fall 2007 Control | Fall 2007 Difference | Fall 2007 P-value | Fall 2008 Treatment | Fall 2008 Control | Fall 2008 Difference | Fall 2008 P-value
BT has mentor | 20.0 | 25.3 | -5.3 | 0.194 | 12.7 | 14.1 | -1.4 | 0.699
BT has assigned mentor | 11.3 | 13.6 | -2.3 | 0.478 | 7.8 | 8.3 | -0.6 | 0.826
Sample Size (Teachers) | 219 | 207 | 426 | | 206 | 192 | 398 |

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher.

Table B.12. Impacts on Teacher-Reported Mentor Profiles (Percentages), Fall 2007 and Fall 2008: One-Year Districts

Mentoring Characteristic | Fall 2007 Treatment | Fall 2007 Control | Fall 2007 Difference | Fall 2007 P-value | Fall 2008 Treatment | Fall 2008 Control | Fall 2008 Difference | Fall 2008 P-value
Number of Mentors
None | 80.0 | 74.7 | 5.3 | 0.194 | 87.3 | 85.9 | 1.4 | 0.699
One | 16.3 | 19.5 | -3.2 | 0.386 | 7.8 | 12.3 | -4.4 | 0.145
Two or more | 3.7 | 5.8 | -2.1 | 0.353 | 5.3 | 2.2 | 3.1 | 0.118
Number of Mentors Assigned
None | 88.7 | 86.4 | 2.3 | 0.478 | 92.2 | 91.6 | 0.6 | 0.818
One or more | 11.3 | 13.6 | -2.3 | 0.478 | 7.8 | 8.4 | -0.6 | 0.818
Mentor Positions
Teacher | 15.1 | 19.9 | -4.9 | 0.184 | 10.8 | 9.8 | 0.9 | 0.767
Other position | 3.6 | 4.8 | -1.2 | 0.572 | 2.5 | 5.9 | -3.4 | 0.111
No mentor | 80.0 | 74.7 | 5.3 | 0.194 | 87.3 | 85.9 | 1.4 | 0.699
Sample Size (Teachers) | 219 | 207 | 426 | | 206 | 192 | 398 |

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

None of the differences is statistically significant at the 0.05 level.

Table B.13. Impacts on Teacher-Reported Mentor Services Received in the Most Recent Full Week of Teaching, Fall 2007 and Fall 2008: One-Year Districts

Mentor Service | Fall 2007 Treatment | Fall 2007 Control | Fall 2007 Difference | Fall 2007 Effect Size^a | Fall 2007 P-value | Fall 2008 Treatment | Fall 2008 Control | Fall 2008 Difference | Fall 2008 Effect Size^a | Fall 2008 P-value
"Usual" Meetings with Mentors
Frequency (number of meetings) | 0.3 | 0.4 | -0.1 | -0.12 | 0.267 | 0.2 | 0.1 | 0.1 | 0.10 | 0.354
Average duration (minutes) | 1.4 | 2.8 | -1.4* | -0.18 | 0.047 | 1.4 | 2.0 | -0.6 | -0.09 | 0.397
Total time^b (minutes) | 6.7 | 12.7 | -6.0 | -0.14 | 0.164 | 6.7 | 4.3 | 1.4 | 0.08 | 0.566
Informal Meetings with Mentors
Total time (minutes) | 6.7 | 10.5 | -3.8 | -0.13 | 0.196 | 7.0 | 7.0 | 0.0 | 0.00 | 0.992
Total Usual and Informal Time with Mentors (Minutes) | 13.3 | 23.1 | -9.8 | -0.15 | 0.138 | 12.7 | 11.3 | 1.5 | 0.03 | 0.770
Meeting Time with Mentors in the Following Positions (Minutes)
Full-time mentor | 0.5 | 1.6 | -1.2 | -0.11 | 0.337 | 0.6 | 0.9 | -0.3 | -0.05 | 0.675
Teacher | 11.4 | 22.1 | -10.7 | -0.16 | 0.128 | 10.6 | 7.9 | 2.6 | 0.07 | 0.563
Administrator | 0.3 | 0.0 | 0.3 | 0.05 | 0.379 | 0.6 | 2.2 | -1.7 | -0.18 | 0.092
Staff external to district | 0.0 | 0.0 | 0.0 | 0.10 | 0.316 | 0.0 | 0.2 | -0.2 | -0.11 | 0.323
Mentor Time in the Following Activities (Minutes)
Observing BT teaching | 1.6 | 2.6 | -1.0 | -0.10 | 0.350 | 1.2 | 0.4 | 0.8 | 0.10 | 0.227
Meeting with BT one-on-one | 4.9 | 6.9 | -2.0 | -0.11 | 0.236 | 3.6 | 3.0 | 0.6 | 0.04 | 0.695
Meeting with BT and other first-year teachers | 0.7 | 2.1 | -1.4 | -0.17 | 0.108 | 1.2 | 0.1 | 1.1 | 0.16 | 0.099
Meeting with BT and other teachers | 4.4 | 7.7 | -3.3 | -0.17 | 0.094 | 3.7 | 1.6 | 2.0 | 0.15 | 0.134
Modeling a lesson | 0.2 | 3.0 | -2.8* | -0.25 | 0.021 | 0.7 | 0.3 | 0.4 | 0.08 | 0.566
Co-teaching a lesson | 0.2 | 1.5 | -1.3 | -0.20 | 0.056 | 0.3 | 0.2 | 0.2 | 0.04 | 0.603
All six activities (all mentors) | 12.1 | 23.8 | -11.7* | -0.22 | 0.000 | 10.7 | 5.0 | 5.7 | 0.16 | 0.097
All six activities (study mentor only) | 0.0 | 0.0 | 0.0 | | | 0.6 | 0.0 | 0.6 | 0.07 | 0.332
Types of Assistance a Mentor Provided (Percentage)
Suggestions to improve practice | 14.9 | 14.3 | 0.6 | n/a | 0.857 | 7.8 | 7.7 | 0.1 | n/a | 0.972
Encouragement or moral support | 17.9 | 20.3 | -2.4 | n/a | 0.534 | 11.0 | 10.8 | 0.2 | n/a | 0.963
Opportunity to raise issues/discuss concerns | 17.1 | 20.2 | -3.0 | n/a | 0.413 | 10.4 | 11.1 | -0.7 | n/a | 0.830
Help with administrative/logistical issues | 14.7 | 16.5 | -1.8 | n/a | 0.590 | 7.5 | 9.1 | -1.6 | n/a | 0.564
Help teaching to meet state or district standards | 7.0 | 15.5 | -8.5* | n/a | 0.037 | 3.8 | 5.6 | -1.8 | n/a | 0.414
Help identifying teaching challenges and solutions | 12.9 | 16.3 | -3.4 | n/a | 0.318 | 7.7 | 7.2 | 0.5 | n/a | 0.861
Discussed instructional goals and ways to achieve them | 9.1 | 15.0 | -5.9 | n/a | 0.062 | 5.8 | 5.7 | 0.2 | n/a | 0.945
Guidance on how to assess students | 12.0 | 15.5 | -3.6 | n/a | 0.299 | 5.1 | 5.2 | -0.1 | n/a | 0.965
Shared lesson plans, assignments, or other instructional activities | 11.8 | 16.8 | -5.0 | n/a | 0.140 | 8.6 | 7.3 | 1.3 | n/a | 0.665
Acted on BT's request^c | 6.5 | 15.1 | -8.6* | n/a | 0.007 | [illegible] | 5.5? | 0.9 | n/a | 0.716

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

a Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

b The product of the mean frequency and mean average duration does not necessarily equal the mean of total time.

c Total sample size is 393 in fall 2007 and 390 in fall 2008. The question did not apply to teachers who did not make a request to their mentors.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher; n/a = not applicable.

Table B.14. Impacts on Teacher-Reported Professional Development Activities During Past Three Months, Fall 2007 and Fall 2008: One-Year Districts

Aspect of Professional Development | Fall 2007 Treatment | Fall 2007 Control | Fall 2007 Difference | Fall 2007 Effect Size^a | Fall 2007 P-value | Fall 2008 Treatment | Fall 2008 Control | Fall 2008 Difference | Fall 2008 Effect Size^a | Fall 2008 P-value
Activities Completed (Percentages)
Kept a written log | 23.3 | 24.1 | -0.8 | n/a | 0.833 | 21.3 | 20.3 | 0.9 | n/a | 0.818
Kept a portfolio and analysis of student work | 77.7 | 73.9 | 3.7 | n/a | 0.356 | 73.2 | 79.7 | -6.4 | n/a | 0.094
Worked with a study group of new teachers | 13.1 | 12.2 | 0.9 | n/a | 0.772 | 13.7 | 16.5 | -2.9 | n/a | 0.466
Worked with a study group of new and experienced teachers | 41.8 | 41.5 | 0.4 | n/a | 0.935 | 38.0 | 36.3 | 1.7 | n/a | 0.735
Observed others teaching in their classrooms | 29.2 | 31.7 | -2.5 | n/a | 0.551 | 30.7 | 32.6 | -1.9 | n/a | 0.686
Observed others teaching your class | 27.1 | 28.8 | -1.7 | n/a | 0.713 | 25.0 | 35.0 | -10.0* | n/a | 0.036
Met with principal to discuss teaching | 44.1 | 49.3 | -5.2 | n/a | 0.288 | 53.1 | 52.6 | 0.4 | n/a | 0.935
Met with literacy or mathematics coach or other curricular specialist | 74.5 | 75.5 | -0.9 | n/a | 0.832 | 70.2 | 73.3 | -3.1 | n/a | 0.537
Met with a resource specialist to discuss needs of particular students | 75.2 | 73.9 | 1.4 | n/a | 0.745 | 70.5 | 73.6 | -3.0 | n/a | 0.482
Frequency of Selected Activities (Number of Times During Past Three Months)
Teaching was observed by mentor | 0.3 | 0.4 | 0.0 | -0.04 | 0.697 | 0.2 | 0.3 | -0.2 | -0.18 | 0.096
Teaching was observed by principal | 1.9 | 1.9 | 0.0 | 0.00 | 0.981 | 1.8 | 1.7 | 0.1 | 0.05 | 0.661
Given feedback on your teaching, not as part of formal evaluation | 1.5 | 1.6 | -0.1 | -0.07 | 0.550 | 1.3 | 1.5 | -0.2 | -0.09 | 0.364
Given feedback on your teaching, as part of formal evaluation | 0.7 | 0.8 | 0.0 | -0.04 | 0.718 | 0.6 | 0.6 | 0.0 | -0.03 | 0.814
Given feedback on your lesson plans | 1.0 | 1.3 | -0.2 | -0.12 | 0.251 | 0.9 | 1.4 | -0.4* | -0.24 | 0.027
Sample Size (Teachers) | 219 | 207 | 426 | | | 206 | 192 | 398 | |

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

a Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

*Significantly different from zero at the 0.05 level.

n/a = not applicable.

Table B.15. Impacts on Teacher-Reported Areas of Professional Development During the Past Three Months, Fall 2007 and Fall 2008: One-Year Districts

Attended Professional Development Activities (Percentages)

Professional Development Topic | Fall 2007 Treatment | Fall 2007 Control | Fall 2007 Difference | Fall 2007 P-value | Fall 2008 Treatment | Fall 2008 Control | Fall 2008 Difference | Fall 2008 P-value
Parent and community relations | [illegible] | 15.2 | [illegible] | 0.808 | [illegible] | 15.1 | [illegible] | 0.431
School policies on student disciplinary procedures | 45.5 | 47.8 | -2.3 | 0.643 | 48.8 | 52.0 | -3.3 | 0.530
Instructional techniques/strategies | 76.7 | 81.6 | -4.9 | 0.191 | 74.8 | 80.4 | -5.6 | 0.170
Understanding the composition of students in your class | 22.4 | 26.0 | -3.7 | 0.394 | 19.3 | 20.9 | -1.5 | 0.676
Content area knowledge (language arts, mathematics, science) | 63.5 | 72.2 | -8.7 | 0.060 | 65.7 | 68.2 | -2.5 | 0.607
Lesson planning | 21.8 | 24.2 | -2.5 | 0.565 | 20.4 | 28.0 | -7.6 | 0.069
Analyzing student work/assessment | 46.8 | 54.3 | -7.5 | 0.116 | 42.1 | 55.1 | -13.1* | 0.009
Student motivation/engagement | 21.1 | 28.2 | -7.1 | 0.122 | 15.4 | 21.1 | -5.6 | 0.138
Differentiated instruction | 41.5 | 51.6 | -10.2* | 0.029 | 43.3 | 49.6 | -6.3 | 0.211
Using computers to support instruction | 44.0 | 38.9 | 5.1 | 0.290 | 39.5 | 42.1 | -2.5 | 0.631
Classroom management techniques | 20.2 | 21.1 | -0.8 | 0.838 | 20.5 | 24.9 | -4.4 | 0.290
Preparing students for standardized testing | 24.8 | 26.2 | -1.4 | 0.721 | 23.8 | 30.8 | -7.0 | 0.147
Sample Size (Teachers) | 219 | 207 | 426 | | 206 | 192 | 398 |

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.



Table B.16. Teacher Reports on Professional Support and Duties (Percentages), Fall 2005 and Fall 2006: Two-Year Districts

Measure | Fall 2005 Treatment | Fall 2005 Control | Fall 2005 Difference | Fall 2005 P-value | Fall 2006 Treatment | Fall 2006 Control | Fall 2006 Difference | Fall 2006 P-value
BT has mentor | 97.5 | 85.7 | 11.8* | 0.001 | 80.4 | 41.0 | 39.4* | 0.000
BT has assigned mentor | 93.9 | 78.7 | 15.2* | 0.000 | 80.0 | 33.5 | 46.6* | 0.000
Sample Size (Teachers) | 213 | 182 | 395 | | 191 | 169 | 360 |

Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher.

Table B.17. Impacts on Teacher-Reported Mentor Profiles (Percentages), Fall 2005 and Fall 2006: Two-Year Districts

Mentoring Characteristic | Fall 2005 Treatment | Fall 2005 Control | Fall 2005 Difference | Fall 2005 P-value | Fall 2006 Treatment | Fall 2006 Control | Fall 2006 Difference | Fall 2006 P-value
Number of Mentors
Multiple mentors | 38.2 | 22.8 | 15.4* | 0.002 | 10.6 | 13.2 | -2.6 | 0.528
Number of Mentors
None | 2.5 | 14.3 | -11.8* | 0.001 | 19.5 | 59.0 | -39.4* | 0.000
One | 59.3 | 63.0 | -3.6 | 0.537 | 69.9 | 27.9 | 42.0* | 0.000
Two | 32.1 | 17.7 | 14.4* | 0.001 | 10.6 | 13.2 | -2.6 | 0.528
Number of Mentors Assigned
No mentor assigned | 6.1 | 21.3 | -15.2* | 0.000 | 20.0 | 66.5 | -46.6* | 0.000
One mentor assigned | 62.8 | 65.7 | -2.9 | 0.630 | 72.8 | 26.5 | 46.2* | 0.000
Two mentors assigned | 31.1 | 13.1 | 18.1* | 0.000 | 7.3 | 6.9 | 0.3 | 0.905
Mentor Positions: Positions of All Mentors
Full-time mentor | 71.5 | 15.8 | 55.7* | 0.000 | 63.6 | 6.5 | 57.1* | 0.000
Teacher | 38.2 | 61.9 | -23.7* | 0.000 | 11.9 | 26.8 | -14.8* | 0.002
School or district administrator or staff external to district | 13.2 | 14.7 | -1.4 | 0.709 | 10.0 | 8.9 | 1.1 | 0.723
No mentor | 2.5 | 14.3 | -11.8* | 0.001 | 19.5 | 59.0 | -39.4* | 0.000
Sample Size (Teachers) | 213 | 182 | 395 | | 191 | 169 | 360 |

Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.

Table B.18. Impacts on Teacher-Reported Mentor Services Received in Most Recent Full Week of Teaching, Fall 2005 and Fall 2006: Two-Year Districts

Mentor Service | Fall 2005 Treatment | Fall 2005 Control | Fall 2005 Difference | Fall 2005 Effect Size^a | Fall 2005 P-value | Fall 2006 Treatment | Fall 2006 Control | Fall 2006 Difference | Fall 2006 Effect Size^a | Fall 2006 P-value
"Usual" Meetings with Mentors
Frequency (number of meetings) | 1.7 | 1.4 | 0.4* | 0.21 | 0.049 | 1.3 | 0.8 | 0.5* | 0.29 | 0.011
Average duration (minutes) | 24.4 | 11.5 | 12.9* | 0.71 | 0.000 | 18.8 | 7.0 | 11.8* | 0.68 | 0.00?
Total time^b (minutes) | 78.5 | 43.3 | 35.2* | 0.40 | 0.001 | 54.8 | 29.5 | 25.3* | 0.28 | 0.032
Informal Meetings with Mentors
Total time (minutes) | 45.5 | 37.7 | 7.8 | 0.17 | 0.127 | 27.0 | 18.2 | 8.8 | 0.23 | 0.051
Total Usual and Informal Time with Mentors (Minutes) | 124.0 | 80.9 | 43.0* | 0.38 | 0.002 | 81.8 | 47.7 | 34.1* | 0.29 | 0.024
Meeting Time with Mentors in the Following Positions (Minutes)
Full-time mentor | 74.8 | 6.4 | 68.4* | 0.85 | 0.000 | 59.3 | 1.9 | 57.4* | 0.90 | 0.000
Teacher | 39.3 | 69.9 | -30.6* | -0.34 | 0.003 | 14.? | 41.9 | -27.7* | -0.28 | 0.043
Administrator | 6.5 | 2.4 | 4.1 | 0.21 | 0.093 | 6.? | 3.2 | 3.0 | 0.14 | 0.173
Staff external to district | 5.7 | 1.9 | 3.3 | 0.09 | 0.364 | 2.? | 0.4 | 2.2 | 0.10 | 0.241
Mentor Time in the Following Activities (Minutes)
Observing BT teaching | 37.5 | 17.4 | 20.1* | 0.55 | 0.000 | 21.8 | 7.4 | 14.3* | 0.53 | 0.000
Meeting with BT one-on-one | 42.5 | 23.2 | 19.2* | 0.57 | 0.000 | 25.1 | 11.7 | 13.4* | 0.42 | 0.000
Meeting with BT and other first-year teachers | 37.7 | 11.4 | 28.3* | 0.64 | 0.000 | 24.8 | 5.8 | 19.0* | 0.52 | 0.000
Meeting with BT and other teachers | 23.3 | 15.8 | 7.5 | 0.23 | 0.055 | 15.1 | 11.4 | 3.7 | 0.10 | 0.330
Modeling a lesson | 16.3 | 9.7 | 6.6* | 0.23 | 0.016 | 11.? | 4.7 | 7.1* | 0.30 | 0.003
Co-teaching a lesson | 12.8 | 9.2 | 3.6 | 0.12 | 0.215 | 7.? | 3.0 | 4.2 | 0.22 | 0.080
All six activities (all mentors) | 169.9 | 86.8 | 83.2* | 0.60 | 0.000 | 105.? | 44.1 | 61.8* | 0.48 | 0.000
All six activities (study mentor only) | 118.7 | 0.0 | 118.7* | 1.17 | 0.000 | 92.? | 0.0 | 92.8* | 0.97 | 0.000
Types of Assistance Mentor Provided (Percentages)
Suggestions to improve practice | 81.1 | 62.4 | 18.?* | n/a | 0.000 | 62.? | 22.9 | 39.5* | n/a | 0.000
Encouragement or moral support | 91.8 | 73.0 | 18.?* | n/a | 0.000 | 72.? | 29.5 | 42.8* | n/a | 0.000
Opportunity to raise issues/discuss concerns | 89.6 | 69.0 | 20.6* | n/a | 0.000 | 71.? | 28.1 | 43.8* | n/a | 0.000
Help with administrative/logistical issues | 73.6 | 59.7 | 13.9* | n/a | 0.004 | 62.? | 24.1 | 38.4* | n/a | 0.003
Help teaching to meet state or district standards | 67.8 | 50.8 | 16.9* | n/a | 0.002 | 55.? | 22.1 | 33.0* | n/a | 0.000
Help identifying teaching challenges and solutions | 81.9 | 57.5 | 24.5* | n/a | 0.000 | 63.? | 23.3 | 40.5* | n/a | 0.000
Discussed instructional goals and ways to achieve them | 75.4 | 48.4 | 27.0* | n/a | 0.000 | 56.? | 25.7 | 31.?* | n/a | 0.000
Guidance on how to assess students | 65.7 | 48.1 | 17.5* | n/a | 0.001 | 49.? | 21.0 | 28.6* | n/a | 0.000
Shared lesson plans, assignments, or other instructional activities | 69.9 | 53.7 | 16.3* | n/a | 0.004 | 53.5 | 25.1 | 28.4* | n/a | 0.000
Acted on BT's request^c | 77.9 | 50.0 | 27.9* | n/a | 0.000 | 59.7 | 23.0 | 36.7* | n/a | 0.000
Sample Size (Teachers) | 213 | 182 | 395 | | | 191 | 169 | 360 | |

Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse. A "?" marks a digit that is not legible in the source.

a Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

b The product of the mean frequency and mean average duration does not necessarily equal the mean of total time.

c Total sample size is 315 in fall 2005 and 313 in fall 2006. The question did not apply to teachers who did not make a request to their mentors.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher; n/a = not applicable.

Table B.19. Impacts on Teacher-Reported Professional Development Activities During Past Three Months, Fall 2005 and Fall 2006: Two-Year Districts

Aspect of Professional Development | Fall 2005 Treatment | Fall 2005 Control | Fall 2005 Difference | Fall 2005 Effect Size^a | Fall 2005 P-value | Fall 2006 Treatment | Fall 2006 Control | Fall 2006 Difference | Fall 2006 Effect Size^a | Fall 2006 P-value
Activities Completed (Percentages)
Kept written log | 40.3 | 33.5 | 6.7 | n/a | 0.221 | 33.5 | 31.6 | 1.9 | n/a | 0.699
Kept portfolio and analysis of student work | 82.4 | 78.6 | 3.8 | n/a | 0.362 | 86.3 | 83.8 | 2.5 | n/a | 0.561
Worked with study group of new teachers | 67.0 | 24.2 | 42.8* | n/a | 0.000 | 41.6 | 19.2 | 22.4* | n/a | 0.000
Worked with study group of new and experienced teachers | 48.1 | 41.8 | 6.3 | n/a | 0.237 | 54.3 | 40.2 | 14.1* | n/a | 0.008
Observed others teaching in their classrooms | 58.2 | 48.6 | 9.6 | n/a | 0.084 | 48.7 | 38.3 | 10.3 | n/a | 0.090
Observed others teaching your class | 46.9 | 47.0 | 0.0 | n/a | 0.995 | 38.5 | 38.5 | 0.1 | n/a | 0.991
Met with principal to discuss teaching | 74.5 | 73.5 | 1.0 | n/a | 0.817 | 55.9 | 53.5 | 2.5 | n/a | 0.665
Met with literacy or mathematics coach or other curricular specialist | 67.8 | 76.6 | -8.9 | n/a | 0.087 | 67.8 | 68.4 | -0.6 | n/a | 0.901
Met with a resource specialist to discuss needs of particular students | 67.6 | 61.2 | 6.4 | n/a | 0.173 | 60.2 | 68.9 | -8.7 | n/a | 0.072
Frequency of Selected Activities (Number of Times During Past Three Months)
Teaching was observed by mentor | 3.4 | 2.1 | 1.3* | 0.56 | 0.000 | 2.3 | 0.8 | 1.6* | 0.73 | 0.000
Teaching was observed by principal | 2.0 | 2.4 | -0.4 | -0.22 | 0.062 | 1.8 | 1.7 | 0.1 | 0.05 | 0.674
Given feedback on your teaching, not as part of formal evaluation | 2.8 | 2.5 | 0.3 | 0.12 | 0.266 | 1.9 | 1.5 | 0.4* | 0.24 | 0.031
Given feedback on your teaching, as part of formal evaluation | 1.6 | 1.5 | 0.2 | 0.14 | 0.185 | 0.9 | 0.7 | 0.2 | 0.17 | 0.079
Given feedback on your lesson plans | 2.0 | 2.0 | 0.0 | -0.02 | 0.886 | 1.5 | 1.7 | -0.2 | -0.09 | 0.459
Sample Size (Teachers) | 213 | 182 | 395 | | | 191 | 169 | 360 | |

Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

a Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

*Significantly different from zero at the 0.05 level.

n/a = not applicable.

Table B.20. Impacts on Teacher-Reported Areas of Professional Development During Past Three Months, Fall 2005 and Fall 2006: Two-Year Districts

Attended Professional Development Activities (Percentages)

Professional Development Topic | Fall 2005 Treatment | Fall 2005 Control | Fall 2005 Difference | Fall 2005 P-value | Fall 2006 Treatment | Fall 2006 Control | Fall 2006 Difference | Fall 2006 P-value
Parent and community relations | 33.2 | 30.5 | 2.6 | 0.580 | 24.9 | 17.5 | 7.3 | 0.138
School policies on student disciplinary procedures | 43.6 | 51.3 | -7.7 | 0.151 | 38.4 | 43.2 | -4.8 | 0.378
Instructional techniques/strategies | 75.3 | 79.3 | -4.0 | 0.337 | 65.6 | 69.0 | -3.4 | 0.467
Understanding the composition of students in your class | 30.3 | 23.1 | 7.2 | 0.142 | 23.8 | 18.8 | 5.0 | 0.268
Content area knowledge (language arts, mathematics, science) | 63.5 | 71.8 | -8.3 | 0.064 | 59.7 | 55.7 | 4.0 | 0.411
Lesson planning | 36.8 | 37.0 | -0.2 | 0.976 | 32.8 | 27.9 | 4.9 | 0.306
Analyzing student work/assessment | 44.7 | 42.8 | 1.9 | 0.716 | 42.2 | 38.5 | 3.7 | 0.488
Student motivation/engagement | 47.5 | 38.8 | 8.8 | 0.116 | 28.4 | 24.7 | 3.7 | 0.433
Differentiated instruction | 55.9 | 46.8 | 9.1 | 0.121 | 41.6 | 41.2 | 0.4 | 0.939
Using computers to support instruction | 35.0 | 36.3 | -1.3 | 0.798 | 37.3 | 34.0 | 3.3 | 0.510
Classroom management techniques | 60.8 | 47.8 | 13.0* | 0.012 | 28.1 | 22.2 | 5.9 | 0.155
Preparing students for standardized testing | 30.3 | 35.7 | -5.5 | 0.261 | 28.0 | 31.6 | -3.7 | 0.476
Sample Size (Teachers) | 213 | 182 | 395 | | 191 | 169 | 360 |

Source: Mathematica First and Third Induction Activities Surveys administered in fall 2005 and fall 2006 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.

Table B.21. Teacher Reports on Professional Support and Duties (Percentages), Spring 2006 and Spring 2007: Two-Year Districts

Measure | Spring 2006 Treatment | Spring 2006 Control | Spring 2006 Difference | Spring 2006 P-value | Spring 2007 Treatment | Spring 2007 Control | Spring 2007 Difference | Spring 2007 P-value
BT has a mentor | 98.4 | 85.5 | 12.9* | 0.000 | 87.4 | 47.1 | 40.3* | 0.000
BT has an assigned mentor | 95.9 | 78.9 | 17.0* | 0.000 | 83.8 | 39.6 | 44.3* | 0.000
Sample Size (Teachers) | 210 | 176 | 386 | | 203 | 169 | 372 |

Source: Mathematica Second Induction Activities Survey administered in spring 2006 to all study teachers and Fourth Induction Activities Survey administered in spring 2007 to study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher.

Table B.22. Impacts on Teacher-Reported Mentor Profiles (Percentages), Spring 2006 and Spring 2007: Two-Year Districts

Mentoring Characteristic | Spring 2006 Treatment | Spring 2006 Control | Spring 2006 Difference | Spring 2006 P-value | Spring 2007 Treatment | Spring 2007 Control | Spring 2007 Difference | Spring 2007 P-value
Number of Mentors
Multiple mentors | 37.8 | 22.4 | 15.4* | 0.005 | 17.6 | 12.6 | 5.1 | 0.197
Number of Mentors
None | 1.6 | 14.5 | -12.9* | 0.000 | 12.6 | 52.9 | -40.3* | 0.000
One | 61.0 | 63.2 | -2.2 | 0.722 | 69.7 | 34.6 | 35.2* | 0.000
Two | 31.5 | 18.4 | 13.1* | 0.016 | 13.4 | 12.6 | 0.8 | 0.829
Number of Mentors Assigned
No mentor assigned | 4.1 | 21.1 | -17.0* | 0.000 | 16.2 | 60.4 | -44.3* | 0.000
One mentor assigned | 64.6 | 61.9 | 2.7 | 0.668 | 73.6 | 31.7 | 41.9* | 0.000
Two mentors assigned | 31.3 | 17.0 | 14.3* | 0.003 | 10.3 | 7.9 | 2.4 | 0.470
Mentor Positions: Positions of All Mentors
Full-time mentor | 74.5 | 16.6 | 57.9* | 0.000 | 67.4 | 15.0 | 52.3* | 0.000
Teacher | 38.8 | 65.4 | -26.6* | 0.000 | 15.7 | 26.9 | -11.2* | 0.025
School or district administrator or staff external to district | 14.1 | 12.5 | 1.6 | 0.671 | 10.9 | 8.5 | 2.4 | 0.444
No mentor | 1.6 | 14.5 | -12.9* | 0.000 | 12.6 | 52.9 | -40.3* | 0.000
Sample Size (Teachers) | 210 | 176 | 386 | | 203 | 169 | 372 |

Source: Mathematica Second Induction Activities Survey administered in spring 2006 to all study teachers and Fourth Induction Activities Survey administered in spring 2007 to study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.
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Table B.23. Impacts on Teacher-Reported Mentor Services Received in the Most Recent Full Week of Teaching, Spring 2006 and 
Spring 2007: Two-Year Districts 


Spring 2006 Spring 2007 

Effect Effect 


Mentor Service 

Treatment 

Control 

Difference 

Size (a)

P-value

Treatment

Control

Difference

Size (a)

P-value

"Usual" Meetings with Mentors












Frequency (number of meetings) 

1.6 

1.2 

0.3 

0.15 

0.172 

1.1 

0.7 

0.5" 

0.33 

0.003 

Average duration (minutes) 

23.4 

11.2 

12.1" 

0.66 

0.000 

19.5 

6.1 


13.4" 

0.81 

0.000 

Total time' (minutes) 

62.3 

43.2 

19.1 

0.21 

0.101 

50.3 

21.5 

28.8’ 

0.43 

0.000 

Informal Meetings with Mentors 












Total time (minutes) 

45.3 

39.1 

6.2 

0.14 

0.1 88 

28.4 

19.5 

8.9" 

0.25 

0.028 

Total Usual and Informal Time with Mentors 

107.6 

82.4 

25.3 

0.21 

0.087 

78.7 

41.0 

37.7" 

0.42 

0.001 

(Minutes) 












Meeting Time with Mentors in the Following Positions












(Minutes) 












Full-time mentor

70.8 

9.6 

61.2" 

0.80 

0.000

54.3 

6.1 


48.2* 

0.77 

0.000 

Teacher 

31.6 

69.3 

-37.7" 

-0.35 

0.006 

18.5 

22.8 

-4.3 

-0.03 

0.524 

Administrator 

3.9 

2.5 

1.4 

0.10 

0.417 

62 

3.6 

2.5 

0.12 

0.239 

Staff external to district

0.8 

1.9 

-1.1 

-0.11 

0.281 

2.0 

1.7 

0.3 

0.03 

0806 

Mentor Time in the Following Activities (Minutes) 












Observing BT teaching 

25.6 

15.6 

10.0' 

0.30 

0.003 

19.1 

8.1 


11.1" 

0.48 

0GO0 

Meeting with BT one-on-one

38.2 

21.2 

17.0" 

0.53 

0.000 

292 

10.1 


19. 1" 

0.66 

0.000 

Meeting with BT and other first-year teachers

34.8 

9.1 

25.8* 

0.65 

0.000 

23.7 

4.5 

19.2" 

0.56 

0.000 

Meeting with BT and other teachers

22.4 

17.0 

5.4 

0.16 

0.137 

15.5 

8.1 


7.3" 

0.24 

0.024 

Modeling a lesson 

14.1 

8.0 

6.1* 

0.23 

0.027 

10.0 

3.6 


6.4’ 

0.33 

0.005 

Co-teaching a lesson 

10.8 

6.5 

4.3 

0.16 

0.082 

7.8 

1.£ 

- 

6.3" 

0.40 

0.000 

All six activities (all mentors)

146.3 

76.9 

69.3" 

0.50 

0.000 

105.3 

36.C 


69.4* 

0.62 

0.000 

All six activities (study mentor only) 

108.7 

0.0 

106.7" 

0.99 

0.000 

82.3 

O.C 


82.3" 

0.87 

0.000 

Types of Assistance a Mentor Provided (Percentage) 












Suggestions to improve practice 

83.2 

52.5 

30.7" 

n'a 

0.000 

68.0 

27.4 


40.6* 

nfa 

0000 

Encouragement or moral support 

92.4 

70.4 

22.0" 

rVa 

0.000 

77.9 

37.6 


40.4' 

nfa 

0000 

Opportunity to raise issues/discuss concerns

90.0 

62.3 

27.7" 

rVa 

0.000 

76.1 

36.: 

■ 

39.7" 

Ha 

ocoo 

Help with administrative/logistical issues

76.6 

53.2 

23.4" 

rVa 

0.000 

59.6 

29. C 


30.6" 

nfa 

ocoo 

Help with teaching to meet state or district

69.6 

47.7 

21.9" 

n'a 

0.000 

58.5 

25.: 


33.3" 

nfa 

0.000 

standards 












Help identifying teaching challenges and solutions

80.7 

51.8 

28.9" 

n'a 

0.000 

66.0 

29.5 

36.5* 

nfa 

0.000 

Discussed instructional goals and ways to achieve

79.1 

48.1 

31.0" 

nfa 

0.000 

65.5 

24.4 

41.0" 

nfa 

0.000 

them 












Guidance on how to assess students 

72.3 

43.5 

28.8" 

n'a 

0.000 

58.3 

19.1 


39.2" 

nfa 

0.000 

Shared lesson plans, assignments, or other 

71.0 

50.5 

20.5* 

nfa 

0.000 

59.7 

22.3 

37.4" 

nfa 

0.000 

instructional activities 












Acted on BT's request (c)

75.9 

54.2 

21.7" 

n'a 

0.000 

604 

23.5 

37.0" 

nfa 

ocoo 

Sample Size (Teachers)

210 

176 

386 



203 

169 

372 





Source: Mathematica Second Induction Activities Survey administered in spring 2006 to all study teachers and Fourth Induction Activities Survey administered in spring 2007 to
study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in
districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

(a) Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

(b) The product of the mean frequency and mean average duration does not necessarily equal the mean of total time.

(c) Total sample size is 306 in spring 2006 and 325 in spring 2007. The question did not apply to teachers who did not make a request to their mentors.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher; n/a = not applicable.
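
The table note on total time can be illustrated with a small worked example. The sketch below uses made-up reports for three hypothetical teachers (the variable names and values are assumptions, not study data) and shows why multiplying the mean number of meetings by the mean meeting length need not reproduce the mean of total meeting time.

```python
from statistics import mean

# Hypothetical weekly reports for three teachers (illustration only, not study data).
meetings_per_week = [1, 2, 3]
minutes_per_meeting = [30, 10, 20]

# Total time is computed teacher by teacher before averaging.
total_minutes = [n * m for n, m in zip(meetings_per_week, minutes_per_meeting)]

mean_of_totals = mean(total_minutes)
product_of_means = mean(meetings_per_week) * mean(minutes_per_meeting)

print(f"Mean of total time: {mean_of_totals:.1f} minutes")
print(f"Product of means:   {product_of_means:.1f} minutes")
```

The two quantities coincide only when frequency and duration are uncorrelated across teachers, which is why the tables report total time as its own row.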
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Table B.24. Impacts on Teacher-Reported Professional Development Activities During Past Three Months, Spring 2006 and Spring 2007: Two-Year 
Districts 


                                                                      Spring 2006                                            Spring 2007
                                                                                      Effect                                                 Effect
Aspect of Professional Development                Treatment    Control Difference   Size (a)    P-value  Treatment    Control Difference   Size (a)    P-value
Activities Completed (Percentages)
  Kept a written log                                   41.6       26.1      15.5*        n/a      0.003       33.1       24.7        8.4        n/a      0.081
  Kept a portfolio and analysis of student work        78.6       76.3        2.3        n/a      0.580       83.1       76.4        6.7        n/a      0.131
  Worked with a study group of new teachers            64.1       24.9      39.2*        n/a      0.000       48.2       14.1      34.1*        n/a      0.000
  Worked with a study group of new and                 48.3       33.9      14.4*        n/a      0.003       51.1       35.9      15.2*        n/a      0.004
    experienced teachers
  Observed others teaching in their classrooms         71.5       46.7      24.9*        n/a      0.000       47.0       35.2      11.9*        n/a      0.020
  Observed others teaching your class                  48.0       36.0      12.0*        n/a      0.029       36.1       36.4       -0.3        n/a      0.953
  Met with principal to discuss teaching               73.2       70.1        3.1        n/a      0.541       68.9       65.4        3.5        n/a      0.460
  Met with a literacy or mathematics coach or          66.2       63.9        2.3        n/a      0.665       63.1       63.1        0.0        n/a      0.999
    other curricular specialist
  Met with a resource specialist to discuss            64.0       59.1        4.9        n/a      0.347       62.2       65.8       -3.6        n/a      0.474
    needs of particular students
Frequency of Selected Activities (Number of Times During Past Three Months)
  Teaching was observed by mentor                       3.2        1.6       1.6*       0.69      0.000        2.5        1.0       1.5*       0.66      0.000
  Teaching was observed by principal                    2.3        1.9        0.4       0.19      0.121        2.0        1.8        0.2       0.10      0.354
  Given feedback on your teaching, not as part          2.5        2.0       0.5*       0.24      0.031        2.2        1.5       0.7*       0.37      0.001
    of formal evaluation
  Given feedback on your teaching, as part of           1.8        1.5        0.3       0.18      0.093        1.6        1.3       0.3*       0.21      0.046
    formal evaluation
  Given feedback on your lesson plans                   1.9        1.6        0.3       0.15      0.175        1.5        1.5        0.0       0.01      0.964
Sample Size (Teachers)                                  210        176        386                              203        169        372

Source: Mathematica Second Induction Activities Survey administered in spring 2006 to all study teachers and Fourth Induction Activities Survey administered in spring 2007 to
study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for
differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

(a) Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

*Significantly different from zero at the 0.05 level.
n/a = not applicable.
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Table B.25. Impacts on Teacher-Reported Areas of Professional Development During the Past Three Months, Spring 2006 and Spring 2007: Two-Year 
Districts 


Attended Professional Development Activities (Percentages)

                                                                         Spring 2006                                 Spring 2007
Professional Development Topic                            Treatment    Control Difference    P-value  Treatment    Control Difference    P-value
Parent and community relations                                 28.2       24.3        3.9      0.423       27.7       23.6        4.0      0.438
School policies on student disciplinary procedures             39.0       34.4        4.5      0.320       36.0       36.7       -0.7      0.893
Instructional techniques/strategies                            80.4       73.2        7.2      0.154       74.4       72.2        2.1      0.662
Understanding the composition of students in your class        31.6       21.5      10.1*      0.033       24.9       23.7        1.2      0.811
Content area knowledge (language arts, mathematics,            69.9       60.3       9.6*      0.040       62.2       57.5        4.7      0.355
  science)
Lesson planning                                                42.9       31.2      11.7*      0.019       37.5       27.8       9.6*      0.038
Analyzing student work/assessment                              60.4       40.6      19.8*      0.000       56.5       45.0      11.5*      0.034
Student motivation/engagement                                  42.7       33.5        9.1      0.071       42.7       23.6      19.0*      0.000
Differentiated instruction                                     62.0       47.0      15.0*      0.010       58.4       43.3      15.1*      0.006
Using computers to support instruction                         36.0       34.3        1.7      0.727       37.9       40.9       -3.0      0.601
Classroom management techniques                                53.3       33.8      19.5*      0.000       26.1       21.9        4.2      0.347
Preparing students for standardized testing                    43.8       50.5       -6.8      0.152       49.1       48.0        1.1      0.838
Sample Size (Teachers)                                          210        176        386                   203        169        372

Source: Mathematica Second Induction Activities Survey administered in spring 2006 to all study teachers and Fourth Induction Activities Survey administered in
spring 2007 to study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for
differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.
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Table B.26. Teacher Reports on Professional Support and Duties (Percentages), Fall 2007 and Fall 2008: Two-Year Districts 




                                            Fall 2007                                   Fall 2008
                            Treatment    Control Difference    P-value  Treatment    Control Difference    P-value
BT has mentor                    14.5       23.3       -8.8      0.066        9.7       17.0      -7.3*      0.043
BT has assigned mentor           11.3       18.8       -7.5      0.077        6.9       11.8       -4.8      0.146
Sample Size (Teachers)            179        147        326                   178        143        321

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for
differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher.
Table B.27. Impacts on Teacher-Reported Mentor Profiles (Percentages), Fall 2007 and Fall 2008: Two-Year Districts

                                              Fall 2007                                   Fall 2008
Mentoring Characteristic      Treatment    Control Difference    P-value  Treatment    Control Difference    P-value
Number of Mentors
  None                             85.5       76.7        8.8      0.066       90.3       83.0       7.3*      0.043
  One                              11.3       19.0       -7.7      0.091        9.1       13.7       -4.5      0.162
  Two or more                       3.3        5.0       -1.6      0.495        2.4        6.1       -3.7      0.126
Number of Mentors Assigned
  None                             88.6       81.1        7.6      0.076       93.0       87.9        5.1      0.135
  One or more                      11.4       18.9       -7.6      0.076        7.0       12.1       -5.1      0.135
Mentor Positions
  Teacher                           7.6       17.1      -9.5*      0.023        6.4       10.6       -4.2      0.179
  Other position                    4.2        6.6       -2.4      0.368        2.4        6.2       -3.8      0.122
  No mentor                        85.5       76.7        8.8      0.066       90.3       83.0       7.3*      0.043
Sample Size (Teachers)              179        147        326                   178        143        321

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to
account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to
item nonresponse.

*Significantly different from zero at the 0.05 level.
n/a = not applicable.



Table B.28. Impacts on Teacher-Reported Mentor Services Received in Most Recent Full Week of Teaching, Fall 2007 and Fall 2008:
Two-Year Districts


Fall 2007 Fall 2008


Effect Effect


Mentor Service 

Treatment 

Control 

Difference 

Size (a)

P-value

Treatment

Control

Difference

Size (a)

P-value

"Usual" Meetings with Mentors











Frequency (number of meetings) 

0 2 

0.3 

-0.1 

-0.11 

0.337 

0.1 

0.3 

-02 

-0.19 

0.182 

Average duration (minutes) 

1.7 

2.7 

-1.0 

-0.12 

0.352 

1.2 

2.1 

-0.9 

•0.13 

0.334 

Total time' (minutes) 

6.7 

8.7 

-2.0 

-0.06 

0.630 

5.7 

8.5 

•2.8 

-0.09 

0.540 

Informal Meetings with Mentors 











Total time (minutes)

5.7 

5.9 

-0.2 

•0.01 

0.939 

4.9 

9.4 

-4.5 

•022 

0.114 

Total Usual and Informal Time with Mentors 
(Minutes) 

12.4 

14.5 

•2.1 

-0.04 

0.731 

10.5 

17.8 

-7.3 

-0.15 

0263 

Meeting Time with Mentors in the Following

Positions (Minutes) 











Full-time mentor

0.3 

1.6 

-1.3 

■0.12 

0.307 

0.2 

2.5 

-2.3 

-021 

0.171 

Teacher 

10.6 

13.7 

-3.0 

-o.oa 

0.642 

9.6 

10.8 

-1.2 

-0.03 

0.845 

Administrator 

0 2 

0.4 

-0.2 

-0.07 

0.401 

0.0 

2.7 

•2.7 

•0.26 

0.081 

Staff external to district 

0.0 

0.0 

0.0 



0.5 

0.8 

•0.3 

-0.04 

0.775 

Mentor Time in the Following Activities (Minutes) 











Observing BT teaching

3.7 

1.4 

2.3 

0.17 

0.183 

3.0 

3.0 

•0.1 

•0.01 

0.969 

Meeting with BT one-on-one 

7.0 

4.2 

2.8 

0.16 

0.217 

2.6 

4.4 

-1.8 

•0.14 

0.332 

Meeting with BT and other first-year teachers 

3.8 

1.4 

2.4 

0.18 

0.160 

1.2 

3.0 

-1.9 

-0.17 

0.282 

Meeting with BT and other teachers

2.9 

5.2 

•2.3 

-0.11 

0.394 

1.4 

62 

-4.8 

•0.30 

0.058 

Modeling a lesson 

2.0 

1.6 

0.3 

0.03 

0.813 

0.7 

1.4 

-0.7 

-0.09 

0.562 

Co-teaching a lesson 

1.3 

0.4 

0.9 

0.12 

0.306 

0.0 

02 

-0.1 

-0.13 

0.326 

All six activities (all mentors)

20.6 

14.2 

6.4 

0.10 

0.422 

9.1 

17.4 

•8.3 

-0.17 

0.266 

All six activities (study mentor only)

5.1 

0.0 

5.1 

023 

0.201 

0.2 

0.0 

0.2 

0.0B 

0.338 

Types of Assistance Mentor Provided (Percentage) 











Suggestions to improve practice 

7.9 

12.8 

-4.9 

n/a 

0.186 

5.6 

10.1 

-4.5 

n/a 

0.150 

Encouragement or moral support 

13.1 

17.3 

-4.2 

n’a 

0.334 

8.2 

13.8 

•5.6 

n/a 

0.096 

Opportunity to raise issues/discuss concerns

122 

16.4 

-42 

ru’a 

0.301 

7.2 

12.6 

•5.4 

n/a 

0.102 

Help with administrative/logistical issues

10.3 

13.1 

•2.8 

ru’a 

0.412 

4.8 

9.3 

-4.5 

n/a 

0.127 

Help teaching to meet state or district standards 

7.8 

10.8 

-2.9 

n'a 

0.410 

5.4 

8.8 

-3.4 

n/a 

0.243 

Help identifying teaching challenges and
solutions 

10.3 

12.4 

-2.1 

n/a 

0.558 

4.1 

8.9 

-4.9 

n/a 

0.069 

Discussed instructional goals and ways to
achieve them 

10.1 

10.2 

•0.1 

n/a 

0.966 

4.3 

8.8 

-4.5 

n/a 

0.106 

Guidance on how to assess students 

8.0 

10.8 

•2.9 

n/a 

0.383 

5.4 

6.1 

-2.7 

n/a 

0.324 

Shared lesson plans, assignments, or other 
instructional activities 

9.1 

11.6 

•2.4 

n/a 

0.5(0 

4.9 

13.4 

•8.5‘ 

n/a 

0.005 

Acted on BT's request" 

8.6 

8.1 

0.5 

n/a 

0.886 

5.7 

10.0 

-4.3 

n/a 

0.132 

Sample Size (Teachers) 

179 

147 

326 



178 

143 

321 




Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in
districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

(a) Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

(b) The product of the mean frequency and mean average duration does not necessarily equal the mean of total time.

(c) Total sample size is 305 in fall 2007 and 318 in fall 2008. The question did not apply to teachers who did not make a request to their mentors.

*Significantly different from zero at the 0.05 level.

BT = beginning teacher; n/a = not applicable.
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Table B.29. Impacts on Teacher-Reported Professional Development Activities During Past Three Months, Fall 2007 and Fall 2008: 
Two-Year Districts 





                                                                       Fall 2007                                              Fall 2008
                                                                                      Effect                                                 Effect
Aspect of Professional Development                Treatment    Control Difference   Size (a)    P-value  Treatment    Control Difference   Size (a)    P-value
Activities Completed (Percentages)
  Kept written log                                     26.6       24.8        1.8        n/a      0.724       26.9       20.6        6.3        n/a      0.169
  Kept portfolio and analysis of student work          84.4       81.1        3.3        n/a      0.475       80.5       84.5       -4.0        n/a      0.324
  Worked with study group of new teachers              13.8       11.6        2.2        n/a      0.539       12.2       17.9       -5.7        n/a      0.221
  Worked with study group of new and                   35.9       34.9        1.0        n/a      0.863       38.9       38.4        0.5        n/a      0.932
    experienced teachers
  Observed others teaching in their classrooms         29.7       32.8       -3.2        n/a      0.569       29.5       37.8       -8.4        n/a      0.136
  Observed others teaching your class                  28.0       34.2       -6.1        n/a      0.323       23.3       28.8       -5.5        n/a      0.261
  Met with principal to discuss teaching               59.7       52.5        7.2        n/a      0.198       50.4       50.7       -0.3        n/a      0.953
  Met with literacy or mathematics coach or            62.6       63.7       -1.2        n/a      0.845       67.3       70.5       -3.2        n/a      0.500
    other curricular specialist
  Met with a resource specialist to discuss            66.9       63.2        3.8        n/a      0.476       64.2       69.0       -4.8        n/a      0.367
    needs of particular students
Frequency of Selected Activities (Number of Times During Past Three Months)
  Teaching was observed by mentor                       0.3        0.4       -0.1      -0.07      0.497        0.2        0.2       -0.1      -0.08      0.528
  Teaching was observed by principal                    1.7        1.4        0.3       0.18      0.068        1.5        1.4        0.1       0.05      0.657
  Given feedback on your teaching, not as part          1.3        1.1        0.2       0.12      0.303        1.5        1.0       0.5*       0.30      0.007
    of formal evaluation
  Given feedback on your teaching as part of            0.7        0.7        0.0      -0.03      0.819        0.7        0.6        0.1       0.10      0.369
    formal evaluation
  Given feedback on your lesson plans                   1.3        1.4        0.0      -0.02      0.886        1.5        1.5        0.0      -0.01      0.905
Sample Size (Teachers)                                  179        147        326                              178        143        321

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to account for differences in
districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

(a) Effect sizes are reported for continuous measures but are not indicated for dichotomous variables that are reported as percentages.

*Significantly different from zero at the 0.05 level.

n/a = not applicable.
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Table B.30. Impacts on Teacher-Reported Areas of Professional Development During Past Three Months, Fall 2007 and Fall 2008: 
Two-Year Districts 


Attended Professional Development Activities (Percentages)

                                                                          Fall 2007                                   Fall 2008
Professional Development Topic                            Treatment    Control Difference    P-value  Treatment    Control Difference    P-value
Parent and community relations                                 15.4       19.5       -4.1      0.340       19.0       15.2        3.7      0.388
School policies on student disciplinary procedures             43.7       41.4        2.3      0.714       44.1       52.0       -7.8      0.178
Instructional techniques/strategies                            77.5       64.3      13.3*      0.008       78.2       77.7        0.5      0.920
Understanding the composition of students in your class        21.4       21.0        0.4      0.933       24.7       24.8       -0.1      0.982
Content area knowledge (language arts, mathematics,            66.9       61.8        5.2      0.319       67.7       57.9        9.8      0.060
  science)
Lesson planning                                                30.3       31.8       -1.5      0.785       27.6       29.0       -1.3      0.795
Analyzing student work/assessment                              46.0       38.0        7.9      0.131       54.7       39.4      15.3*      0.011
Student motivation/engagement                                  24.0       21.4        2.6      0.568       29.3       23.3        6.0      0.203
Differentiated instruction                                     53.0       50.2        2.8      0.640       46.1       46.3       -0.1      0.984
Using computers to support instruction                         38.1       42.6       -4.5      0.440       39.7       42.3       -2.7      0.642
Classroom management techniques                                26.3       21.6        4.6      0.346       23.2       22.5        0.7      0.889
Preparing students for standardized testing                    20.9       28.3       -7.4      0.082       33.1       30.5        2.6      0.628
Sample Size (Teachers)                                          179        147        326                   178        143        321

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted using ordinary least squares to
account for differences in districts, teacher grade assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to
item nonresponse.

*Significantly different from zero at the 0.05 level.
Figure B.1. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week, Fall 2005: One-Year
Districts

[Figure omitted; horizontal axis: District.]

Source: Mathematica First Induction Activities Survey administered in fall 2005 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes A through J are arbitrary. Districts are ordered according to the size of the
treatment-control difference. N=503 teachers.

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple comparisons.)
Figure B.2. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week, Fall 2006: One-Year
Districts

Source: Mathematica Third Induction Activities Survey administered in fall 2006 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes A through J are arbitrary. Districts are ordered according to the size of the
treatment-control difference. N=472 teachers.

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple comparisons.)
Figure B.3. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week, Fall 2005: Two-Year
Districts

Source: Mathematica First Induction Activities Survey administered in fall 2005 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes K through Q are arbitrary. Districts are ordered according to the size of the
treatment-control difference. N=395 teachers.

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple comparisons.)
Figure B.4. Treatment-Control Differences in Total Minutes Spent in Mentoring per Week, Fall 2006: Two-Year
Districts

Source: Mathematica Third Induction Activities Survey administered in fall 2006 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes K through Q are arbitrary. Districts are ordered according to the size of the
treatment-control difference. N=360 teachers.

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple comparisons.)
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APPENDIX C 

SENSITIVITY ANALYSES AND SUPPLEMENTAL INFORMATION 

FOR CHAPTER V 

Chapter V presented estimates of the impacts of comprehensive induction on classroom 
practices and student test scores. This appendix supplements that analysis with additional results on 
the sensitivity of the benchmark findings to alternative assumptions and with additional tables 
providing detail on the sample sizes used to produce the estimates. 

A. Supplementary Information and Sensitivity Analysis for Classroom Practices 

We re-estimated the impacts on classroom practices by using a variety of assumptions about 
item scoring and estimation and found that the results did not change substantially. 

The results were not sensitive to how we grouped the individual items into constructs. We 
performed a factor analysis of the 16 classroom observation items to explore the degree to which 
the theoretical groupings were empirically justified. In finding the groupings justified, we maintained 
the three-construct scoring method (implementation of literacy lesson, content of literacy lesson, 
and classroom culture) described in Chapter V. Although the factor analyses were consistent with 
the theoretical groupings, they did suggest that the implementation and content items could be
grouped together, forming one construct rather than two. (Factor loadings for the 16 class
observation items are shown in Table A.5 in Appendix A.) When we substituted a single construct
that included all implementation and content items in place of two constructs, there were no 
statistically significant impacts. 

The findings did not change when we collapsed the scale or divided the sample into two 
subgroups. As part of our sensitivity analyses, we estimated the model separately for each classroom
observation item after recoding each score from a five-point scale into a binary variable: (1) no,
limited, or moderate evidence or (2) consistent or extensive evidence of good practice. This
dichotomous coding scheme allowed us to compare the percentages of treatment and control
teachers who demonstrated “consistent” or “extensive” evidence of good practice in the classroom.
The results, however, support the same conclusions of no impact. Impact estimates for each of the
16 class observation items are shown in Table C.1.
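
The recoding step can be sketched in a few lines of code. This is a minimal illustration, not the study's estimation code. It assumes the five scale points are coded 1 (no evidence) through 5 (extensive evidence), so that scores of 4 or 5 count as consistent or extensive evidence; the function names and the item scores are hypothetical. The estimates in Table C.1 are also weighted and regression adjusted, which this sketch omits.

```python
from typing import Sequence

# Assumed coding (not stated in the report): 1 = no, 2 = limited, 3 = moderate,
# 4 = consistent, 5 = extensive evidence of good practice.
GOOD_PRACTICE_THRESHOLD = 4


def to_binary(score: int) -> int:
    """Collapse a five-point item score into 0 (no/limited/moderate) or 1 (consistent/extensive)."""
    if not 1 <= score <= 5:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    return 1 if score >= GOOD_PRACTICE_THRESHOLD else 0


def percent_good_practice(scores: Sequence[int]) -> float:
    """Unweighted percentage of teachers showing consistent or extensive evidence."""
    flags = [to_binary(s) for s in scores]
    return 100.0 * sum(flags) / len(flags)


# Hypothetical item scores for illustration only.
treatment_scores = [2, 4, 3, 5, 1, 4, 3, 2]
control_scores = [3, 3, 4, 2, 5, 1, 2, 3]

difference = percent_good_practice(treatment_scores) - percent_good_practice(control_scores)
print(f"Unadjusted difference: {difference:.1f} percentage points")
```

Collapsing the scale this way trades detail for interpretability: the outcome becomes a simple percentage, at the cost of ignoring movement within the lower three or upper two categories.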

The results were not sensitive to the choice of summary score. In addition to scoring individual 
items under each domain, classroom observers reported a summary score for each of the three 
domains. They based the summary score on a five-point scale that could differ from our constructed
domain scores in two ways. First, they reported the score as an integer such that they had to round
off to the nearest whole number and thus could have recorded numbers that differ from the average
score. Second and more significant, observers could exercise their discretion in assigning an overall
domain score. Thus, if indicator scores were 3, 3, 3, 4, and 4 for the five indicators, respectively,
then an observer, in reporting the overall domain score, could round up to 4 instead of down to 3
based on a judgment that the last two domains are more important than the first three. Observers
could also justify an overall score of 4 if the item scores of 3 were actually rounded down from, say,
3.4 and the item scores of 4 had been rounded down from 4.4. (The average of 3.4, 3.4, 3.4, 4.4, and
4.4 is 3.8, which rounds up to 4.)
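
The arithmetic in this example can be checked directly. The short sketch below is illustrative only: the helper `round_half_up` and the variable names are assumptions introduced here, and the scores are the ones used in the example above (recorded items of 3, 3, 3, 4, and 4, against unrounded judgments of 3.4, 3.4, 3.4, 4.4, and 4.4).

```python
import math
from statistics import mean


def round_half_up(value: float) -> int:
    """Round to the nearest whole number, with halves rounding up."""
    return math.floor(value + 0.5)


recorded_items = [3, 3, 3, 4, 4]              # integer item scores as recorded
underlying_items = [3.4, 3.4, 3.4, 4.4, 4.4]  # hypothetical unrounded judgments

computed_average = mean(recorded_items)       # average of the recorded integers
underlying_average = mean(underlying_items)   # average before item-level rounding

print(round_half_up(computed_average))    # domain score implied by recorded items
print(round_half_up(underlying_average))  # domain score an observer could justify
```

The recorded items average 3.4 and round to 3, while the unrounded judgments average about 3.8 and round to 4, so an observer summary score can legitimately sit one point above the computed average.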
The two types of summary scores were not identical. Given that each has advantages and
disadvantages, we had to choose one arbitrarily to include for the benchmark estimates presented in
Chapter V. When we substituted the observer summary scores for the computed average scores, we
reached the same conclusions: no statistically significant impact of treatment (see Table C.2).

We also estimated the model separately for one-year and two-year districts (see Tables C.3 and
C.4) and found that the impact estimates were not significantly different from zero.

Histograms for treatment and control teachers' performance in each of the three domains are
shown in Figures C.1 through C.3. These histograms illustrate the pattern of variation (or distribution) of
the classroom practices data.

Impact estimates for literacy implementation scores are presented separately by district in
Figures C.4 and C.5.
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Table C.l. Impacts on Classroom Practices (Percentages with Consistent or Extensive Evidence of Practice): 
One-Year Districts and Two-Year Districts Combined, 2005-2006 School Year

Classroom Observation Item                            Treatment   Control   Difference   P-value

Implementation of Literacy Lesson
  Best practices                                         23.4       27.2       -3.8       0.306
  Instructional choices                                  28.8       30.7       -1.8       0.614
  Student choices                                        18.2       18.4       -0.2       0.952
  Pace                                                   24.2       26.3       -2.1       0.559
  Student-student interaction                            16.8       15.5        1.3       0.682

Content of Literacy Lesson
  Understanding content and close reading                23.5       25.4       -1.9       0.593
  Assessment                                              7.2        7.4       -0.2       0.935
  Skill development                                      17.9       17.8        0.1       0.983
  Connections between reading and writing                15.9       17.0       -1.1       0.737

Classroom Culture
  Maximizes learning opportunities                       44.4       46.4       -2.0       0.619
  Routines clear and consistent                          46.1       49.4       -3.3       0.434
  Behavior respectable, atmosphere safe                  45.3       44.0        1.2       0.756
  Literacy valued                                        28.1       31.1       -3.0       0.429
  Teacher works collaboratively with students            39.5       37.2        2.2       0.594
  Students work collaboratively with other students      25.0       23.8        1.2       0.735
  Equal access to teacher and resources                  41.3       46.0       -1.6       0.291

Sample Size (Teachers)                                    342        289

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data pertain to teachers in all districts participating in the study. Data are weighted and regression
adjusted to account for differences in baseline characteristics and the study design.

None of the differences is statistically significant at the 0.05 level.

Table C.2. Impacts on Classroom Practices (Observer Summary Scores): One-Year Districts and Two-Year
Districts Combined, 2005-2006 School Year

Outcome                               Treatment   Control   Difference   Effect Size   P-value

Implementation of literacy lesson        2.7        2.7        0.0         -0.01        0.942
Content of literacy lesson               2.5        2.5        0.0         -0.01        0.859
Classroom culture                        3.1        3.0        0.0          0.02        0.804

Sample Size (Teachers)                   342        289

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data pertain to teachers in all districts participating in the study. Data are weighted and regression
adjusted to account for differences in baseline characteristics and the study design. Scoring scale: (1)
no evidence, (2) limited evidence, (3) moderate evidence, (4) consistent evidence, or (5) extensive
evidence.

None of the differences is statistically significant at the 0.05 level.


Table C.3. Impacts on Classroom Practices (Average Score on a Five-Point Scale): One-Year Districts,
2005-2006 School Year

Outcome                               Treatment   Control   Difference   Effect Size   P-value

Implementation of literacy lesson        2.7        2.7        0.0         -0.05        0.646
Content of literacy lesson               2.4        2.4        0.0         -0.03        0.774
Classroom culture                        3.1        3.1        0.0          0.04        0.720

Sample Size (Teachers)                   178        164

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and
regression adjusted to account for differences in baseline characteristics and the study design. Scoring
scale: (1) no evidence, (2) limited evidence, (3) moderate evidence, (4) consistent evidence, or (5)
extensive evidence of effective teaching practice.

None of the differences is statistically significant at the 0.05 level.

Table C.4. Impacts on Classroom Practices (Average Score on a Five-Point Scale): Two-Year Districts,
2005-2006 School Year


Outcome                               Treatment   Control   Difference   Effect Size   P-value

Implementation of literacy lesson        2.7        2.6        0.1          0.08        0.467
Content of literacy lesson               2.3        2.3        0.0          0.01        0.935
Classroom culture                        3.0        3.0        0.0          0.04        0.774

Sample Size (Teachers)                   164        125

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and
regression adjusted to account for differences in baseline characteristics and the study design. Scoring
scale: (1) no evidence, (2) limited evidence, (3) moderate evidence, (4) consistent evidence, or
(5) extensive evidence of effective teaching practice.

None of the differences is statistically significant at the 0.05 level.

Figure C.1. Distribution of Literacy Implementation Scores: One-Year Districts and Two-Year Districts
Combined, 2005-2006 School Year

[Histogram not reproduced. Panel label recovered from the figure: Control (n=293)]

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data pertain to teachers in all districts participating in the study. Data are weighted and regression
adjusted to account for differences in baseline characteristics and the study design. Scoring scale: (1)
no evidence, (2) limited evidence, (3) moderate evidence, (4) consistent evidence, or (5) extensive
evidence of effective teaching practice.

Figure C.2. Distribution of Literacy Content Scores: One-Year Districts and Two-Year Districts Combined,
2005-2006 School Year

[Histogram not reproduced. Panel label recovered from the figure: Control (n=293)]

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data pertain to teachers in all districts participating in the study. Data are weighted and regression
adjusted to account for differences in baseline characteristics and the study design. Scoring scale: (1)
no evidence, (2) limited evidence, (3) moderate evidence, (4) consistent evidence, or (5) extensive
evidence of effective teaching practice.

Figure C.3. Distribution of Literacy Culture Scores: One-Year Districts and Two-Year Districts Combined,
2005-2006 School Year

[Histogram not reproduced. Panel labels recovered from the figure: Treatment (n=346), Control (n=295)]

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data pertain to teachers in all districts participating in the study. Data are weighted and regression
adjusted to account for differences in baseline characteristics and the study design. Scoring scale: (1)
no evidence, (2) limited evidence, (3) moderate evidence, (4) consistent evidence, or (5) extensive
evidence of effective teaching practice.

Figure C.4. Impacts on Literacy Implementation Scores by District: One-Year Districts, 2005-2006 School
Year

[Figure not reproduced. Vertical axis runs from -1.5 to 1.5; horizontal axis: District]

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes A through J are arbitrary. Districts are ordered according to the size of the impact.
N=342 teachers.

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple comparisons.)

Figure C.5. Impacts on Literacy Implementation Scores by District: Two-Year Districts, 2005-2006 School
Year

[Figure not reproduced.]

Source: Mathematica Teacher Background Survey administered in fall 2005 to all study teachers; Mathematica
classroom observations conducted in spring 2006.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes K through O are arbitrary. Districts are ordered according to the size of the impact.
N=289 teachers.

None of the differences is statistically significant at the 0.05 level.

B. Supplementary Information for Student Achievement

Tables C.5-C.10 show treatment and control sample sizes for the results shown in Tables V.2-V.7.
Figures C.6-C.9 illustrate district-by-district effects for reading and math in one-year districts
(C.6 and C.7) and two-year districts (C.8 and C.9).

Table C.5. Treatment and Control Sample Sizes for Impacts on Test Scores: One-Year Districts, 2007-2008
School Year

               Sample Sizes: Treatment Group               Sample Sizes: Control Group
Subject     Students  Teachers   Schools Districts      Students  Teachers   Schools Districts

Reading          776        44        37         8           914        55        42         8
Math             766        44        35         8           863        51        41         8

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by
participating school districts.


Table C.6. Treatment and Control Sample Sizes for Impacts on Test Scores by Grade: One-Year Districts,
2007-2008 School Year

                  Sample Sizes: Treatment Group    Sample Sizes: Control Group
Subject/Grade        Students     Districts           Students     Districts

Reading
  Grade 2               182           4                  262           4
  Grade 3               200           5                  221           5
  Grade 4               242           7                  237           7
  Grade 5               152           4                  194           4
  All Grades            776           8                  914           8

Math
  Grade 2                94           2                  106           2
  Grade 3               199           5                  223           5
  Grade 4               321           8                  340           8
  Grade 5               152           4                  194           4
  All Grades            766           8                  863           8

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by
participating school districts.

Table C.7. Treatment and Control Sample Sizes for Impacts on Test Scores, Alternate Model Specifications:
One-Year Districts, 2007-2008 School Year

                                                                   Sample Sizes: Treatment Group               Sample Sizes: Control Group
Subject/Model                                                   Students  Teachers   Schools Districts      Students  Teachers   Schools Districts

Reading
  1. Benchmark                                                       776        44        37         8           914        55        42         8
  2. No teacher covariates                                           776        44        37         8           914        55        42         8
  3. No teacher or student covariates                                776        44        37         8           914        55        42         8
  4. Schools weighted equally                                        776        44        37         8           914        55        42         8
  5. Districts weighted equally                                      776        44        37         8           914        55        42         8
  6. Using specific information on teacher assignments               721        41        35         8           819        48        36         8
  7. Without imposing data restrictions                              874        48        39         8         1,018        59        43         8
  8. No pretest, benchmark sample                                    776        44        37         8           914        55        42         8
  9. No pretest, expanded sample                                   1,383        70        50         8         1,473        81        57         8
  10. Compare teachers within districts, not district-grades         921        52        41         8         1,052        62        49         8
  11. Instrumental variables                                         708        42        36         8           811        51        41         8

Math
  1. Benchmark                                                       766        44        35         8           863        51        41         8
  2. No teacher covariates                                           766        44        35         8           863        51        41         8
  3. No teacher or student covariates                                766        44        35         8           863        51        41         8
  4. Schools weighted equally                                        766        44        35         8           863        51        41         8
  5. Districts weighted equally                                      766        44        35         8           863        51        41         8
  6. Using specific information on teacher assignments               746        43        34         8           741        44        35         8
  7. Without imposing data restrictions                              309        45        36         8           891        52        41         8
  8. No pretest, benchmark sample                                    766        44        35         8           863        51        41         8
  9. No pretest, expanded sample                                   1,297        65        46         8         1,405        73        54         8
  10. Compare teachers within districts, not district-grades         803        46        37         8         1,001        58        48         8
  11. Instrumental variables                                         730        43        35         8           824        50        40         8

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by participating school
districts.

Table C.8. Treatment and Control Sample Sizes for Impacts on Test Scores: Two-Year Districts, 2007-2008
School Year

               Sample Sizes: Treatment Group               Sample Sizes: Control Group
Subject     Students  Teachers   Schools Districts      Students  Teachers   Schools Districts

Reading          807        43        30         6           540        31        26         6
Math             699        39        27         6           499        29        25         6

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by
participating school districts.


Table C.9. Treatment and Control Sample Sizes for Impacts on Test Scores by Grade: Two-Year Districts,
2007-2008 School Year

                  Sample Sizes: Treatment Group    Sample Sizes: Control Group
Subject/Grade        Students     Districts           Students     Districts

Reading
  Grade 2                51           1                   12           1
  Grade 3               173           2                   88           2
  Grade 4               379           5                  362           5
  Grade 5               192           2                   54           2
  Grade 6                12           1                   24           1
  All Grades            807           6                  540           6

Math
  Grade 2                51           1                   12           1
  Grade 3               208           2                   71           2
  Grade 4               274           5                  340           5
  Grade 5               154           2                   52           2
  Grade 6                12           1                   24           1
  All Grades            699           6                  499           6

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by
participating school districts.

Table C.10. Treatment and Control Sample Sizes for Impacts on Test Scores, Alternate Model Specifications:
Two-Year Districts, 2007-2008 School Year

                                                                   Sample Sizes: Treatment Group               Sample Sizes: Control Group
Subject/Model                                                   Students  Teachers   Schools Districts      Students  Teachers   Schools Districts

Reading
  1. Benchmark                                                       807        43        30         6           540        31        26         6
  2. No teacher covariates                                           807        43        30         6           540        31        26         6
  3. No teacher or student covariates                                807        43        30         6           540        31        26         6
  4. Schools weighted equally                                        807        43        30         6           540        31        26         6
  5. Districts weighted equally                                      807        43        30         6           540        31        26         6
  6. Using specific information on teacher assignments               807        43        30         6           540        31        26         6
  7. Without imposing data restrictions                              807        43        30         6           540        31        26         6
  8. No pretest, benchmark sample                                    807        43        30         6           540        31        26         6
  9. No pretest, expanded sample                                   1,464        73        40         7           993        54        38         7
  10. Compare teachers within districts, not district-grades         908        49        34         6           576        33        27         6
  11. Instrumental variables                                         805        42        29         6           533        31        26         6

Math
  1. Benchmark                                                       699        39        27         6           499        29        25         6
  2. No teacher covariates                                           699        39        27         6           499        29        25         6
  3. No teacher or student covariates                                699        39        27         6           499        29        25         6
  4. Schools weighted equally                                        699        39        27         6           499        29        25         6
  5. Districts weighted equally                                      699        39        27         6           499        29        25         6
  6. Using specific information on teacher assignments               662        37        26         6           499        29        25         6
  7. Without imposing data restrictions                              730        40        28         6           529        30        26         6
  8. No pretest, benchmark sample                                    699        39        27         6           499        29        25         6
  9. No pretest, expanded sample                                   1,290        68        39         7           934        52        38         7
  10. Compare teachers within districts, not district-grades         600        45        31         6           598        32        27         6
  11. Instrumental variables                                         697        39        27         6           496        29        25         6

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by participating school
districts.

Figure C.6. Impacts on Reading Test Scores by District: One-Year Districts, 2007-2008
School Year

[Figure not reproduced.]

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by
participating school districts.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes A through J are arbitrary. Districts are ordered according to the size of the impact.
Impacts are expressed as a fraction of a standard deviation in scores, where the standard deviation is
based on all study students in the same grade and district. N=99 teachers and 1,690 students.

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple
comparisons.)

Figure C.7. Impacts on Math Test Scores by District: One-Year Districts, 2007-2008 School Year

[Figure not reproduced. Horizontal axis: District]

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by
participating school districts.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes A through J are arbitrary. Districts are ordered according to the size of the impact.
Impacts are expressed as a fraction of a standard deviation in scores, where the standard deviation is
based on all study students in the same grade and district. N=95 teachers and 1,629 students.

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple
comparisons.)

Figure C.8. Impacts on Reading Test Scores by District: Two-Year Districts, 2007-2008 School Year

[Figure not reproduced.]

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by
participating school districts.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes K through O are arbitrary. Districts are ordered according to the size of the impact.
Impacts are expressed as a fraction of a standard deviation in scores, where the standard deviation is
based on all study students in the same grade and district. N=74 teachers and 1,347 students.

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple
comparisons.)

Figure C.9. Impacts on Math Test Scores by District: Two-Year Districts, 2007-2008 School Year

[Figure not reproduced.]

Source: Mathematica analysis using data from the 2006-2007 and 2007-2008 school years provided by
participating school districts.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and
regression adjusted using ordinary least squares to account for differences in benchmark covariates
and robust variance estimation to account for study design and the clustering of teachers within
schools. Plot symbols represent the difference between regression-adjusted treatment and control
mean within each district, and the vertical lines show the 95 percent confidence interval around each
point. District codes K through O are arbitrary. Districts are ordered according to the size of the impact.
Impacts are expressed as a fraction of a standard deviation in scores, where the standard deviation is
based on all study students in the same grade and district. N=68 teachers and 1,198 students.

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple
comparisons.)

Table C.11. Changes in Impacts on Test Scores Over Time, Common Sample of Teachers: Two-Year Districts


                     Adjusted Mean Test Scores                                             Sample Sizes
Subject               Treatment     Control     Difference       P-value     Students   Teachers   Districts
                                               (Effect Size)

Reading
  Year 1                -0.14         0.02         -0.15           0.18         637         37          5
  Year 3                -0.08        -0.32          0.24*          0.00         676         37          5
  Year 3-Year 1          0.05        -0.34          0.39*          0.00

  Year 2                -0.13        -0.18          0.04           0.59         947         52          6
  Year 3                -0.16        -0.20          0.04           0.38         947         52          6
  Year 3-Year 2         -0.02        -0.02          0.00           0.99

Math
  Year 1                -0.04         0.09         -0.13           0.32         567         37          5
  Year 3                -0.04        -0.13          0.10           0.22         684         37          5
  Year 3-Year 1          0.00        -0.22          0.22           0.14

  Year 2                -0.21        -0.22          0.01           0.88         902         52          6
  Year 3                -0.09        -0.19          0.10           0.18         961         52          6
  Year 3-Year 2          0.11         0.02          0.09           0.38

Source: Mathematica analysis using data from the 2004-2005, 2005-2006, 2006-2007 and 2007-2008 school
years provided by participating school districts; Mathematica Teacher Background Survey administered
in fall 2005 to all study teachers.

Notes: Data pertain to teachers in two-year districts participating in the study. Data are regression adjusted to
account for pretest, student and teacher characteristics, district-by-grade fixed effects, and clustering of
students within schools. The common sample is the subsample of teachers who had students with valid
test score data in either Year 1 and Year 3 or Year 2 and Year 3.

*Significantly different from zero at the 0.05 level.
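
The "Year 3-Year 1" and "Year 3-Year 2" rows in Table C.11 can be read as changes in the impact estimate over time. The sketch below works through the arithmetic for the reading row using the rounded values printed in the table. It is an illustration only: the function names are hypothetical, and the study's estimates come from its regression model rather than from subtracting published figures.

```python
def change_in_impact(impact_early: float, impact_late: float) -> float:
    """Later-year impact minus earlier-year impact."""
    return impact_late - impact_early


def difference_in_changes(treat_early: float, treat_late: float,
                          ctrl_early: float, ctrl_late: float) -> float:
    """Change over time for treatment minus change over time for control."""
    return (treat_late - treat_early) - (ctrl_late - ctrl_early)


# Reading, Year 1 versus Year 3 (rounded values as printed in Table C.11)
from_impacts = change_in_impact(-0.15, 0.24)
from_means = difference_in_changes(-0.14, -0.08, 0.02, -0.32)

print(round(from_impacts, 2))
print(round(from_means, 2))
```

Subtracting the two printed impacts gives 0.39, matching the table. Recomputing from the four printed means gives 0.40, because those inputs are already rounded to two decimals. Differences of about 0.01 between recomputed and published values are therefore expected.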
Table C.12. Impacts on Intermediate Outcomes, Reading Test Score Sample


Outcome                                        Treatment   Control   Difference   P-value      N

Any mentor assigned (percentage)
  Year 1, fall                                   93.96      73.85       20.1       0.030*      66
  Year 1, spring                                 98.37      75.20       23.2       0.007*      67
  Year 2, fall                                   84.52      31.14       53.4       0.000*      63
  Year 2, spring                                 92.20      39.57       52.6       0.000*      64

Total time spent meeting with mentors
(minutes per week)
  Year 1, fall                                   63.69      73.07       -9.4       0.783       66
  Year 1, spring                                 62.00      74.11      -12.1       0.851       67
  Year 2, fall                                   29.32      42.65      -13.3       0.852       63
  Year 2, spring                                 68.56      40.98       27.6       0.556       65

Suggestions to improve practice
(percentage)
  Year 1, fall                                   80.48      57.21       23.3       0.030*      66
  Year 1, spring                                 88.99      52.25       36.7       0.001*      67
  Year 2, fall                                   55.32      25.14       30.2       0.013*      57
  Year 2, spring                                 54.94      27.35       27.6       0.016*      65

Guidance in teaching reading
(percentage)
  Year 1, fall                                   56.58      43.32       13.3       0.343       66
  Year 1, spring                                 80.09      34.19       45.9       0.001*      66
  Year 2, fall (question not asked)
  Year 2, spring                                 31.68      19.89       11.8       0.325       65

Given feedback on teaching, not as part
of formal evaluation (number of times)
  Year 1, fall                                    2.67       2.42        0.2       0.648       65
  Year 1, spring                                  3.03       1.96        1.1       0.025*      66
  Year 2, fall                                    1.35       1.52       -0.2       0.727       62
  Year 2, spring                                  2.71       1.53        1.2       0.002*      65

Classroom practices in year 1
  Literacy lesson content                         2.34       2.38       -0.0       0.831       56
  Literacy lesson implementation                  2.74       2.65        0.1       0.705       56
  Classroom culture                               3.27       3.04        0.2       0.199       56

Source: Mathematica analysis using data from the 2007-2008 school year provided by participating school
districts; Mathematica Teacher Background Survey administered in fall 2005 and Induction Activities
Surveys administered in fall 2005, spring 2006, fall 2006, and spring 2007.

Notes: Data pertain to teachers in two-year districts participating in the study who were also in the benchmark
reading test score analysis. Classroom practice data are weighted and regression adjusted using
ordinary least squares to account for differences in baseline characteristics and the study design.

*Significantly different from zero at the 0.05 level.

Table C.13. Impacts on Intermediate Outcomes, Math Test Score Sample


Outcome                                        Treatment   Control   Difference   P-value      N

Any mentor assigned (percentage)
  Year 1, fall                                   92.11      73.85       18.3       0.053       59
  Year 1, spring                                 96.24      75.20       21.0       0.016       62
  Year 2, fall                                   86.55      31.14       55.4       0.000       56
  Year 2, spring                                 89.50      39.57       49.9       0.000       56

Total time spent meeting with mentors
(minutes per week)
  Year 1, fall                                   34.92      73.07      -38.2       0.345       59
  Year 1, spring                                 71.29      74.11       -2.8       0.961       62
  Year 2, fall                                   26.62      42.65      -16.0       0.839       56
  Year 2, spring                                 72.04      40.98       31.1       0.506       57

Suggestions to improve practice
(percentage)
  Year 1, fall                                   78.82      57.21       21.6       0.050       59
  Year 1, spring                                 86.44      52.25       34.2       0.002       62
  Year 2, fall                                   57.31      25.14       32.2       0.015       49
  Year 2, spring                                 49.52      27.35       22.2       0.065       57

Guidance in teaching math
(percentage)
  Year 1, fall                                   37.21      34.74        2.5       0.854       59
  Year 1, spring                                 59.36      31.76       27.6       0.073       62
  Year 2, fall (question not asked)
  Year 2, spring                                 35.88      16.92       19.0       0.126       57

Given feedback on teaching, not as part
of formal evaluation (number of times)
  Year 1, fall                                    2.96       2.42        0.5       0.250       58
  Year 1, spring                                  2.63       1.96        0.7       0.201       62
  Year 2, fall                                    1.25       1.52       -0.3       0.606       55
  Year 2, spring                                  2.40       1.53        0.9       0.012       57

Source: Mathematica analysis using data from the 2007-2008 school year provided by participating school
districts; Mathematica Teacher Background Survey administered in fall 2005; Mathematica Induction
Activities Surveys administered in fall 2005, spring 2006, fall 2006, and spring 2007.

Notes: Data pertain to teachers in two-year districts participating in the study who were also in the benchmark
math test score analysis.

*Significantly different from zero at the 0.05 level.

APPENDIX D

SENSITIVITY ANALYSES AND SUPPLEMENTAL INFORMATION 

FOR CHAPTER VI 

Chapter VI presented estimates of the impacts of comprehensive induction on workforce 
outcomes (teacher attitudes and teacher mobility). This appendix presents supplemental results 
related to those findings. 

A. Supplementary Information and Sensitivity Analysis for Teacher Satisfaction 

Table D.1 presents results for teacher satisfaction for one-year districts in fall 2005, spring 2006,
fall 2006, fall 2007, and fall 2008. Table D.2 presents results for teacher satisfaction for two-year
districts in fall 2005, spring 2006, fall 2006, spring 2007, fall 2007, and fall 2008. These results are
summarized in Chapter VI, Figures VI.1 and VI.2.

One concern with the analysis of the teacher attitudes data is that the summary scores may
mask impacts for individual items that make up the three summary scores within each domain.
Another concern is that self-reported attitude measures rely on scales that may not have equal
intervals. For example, the difference between the first and second categories may be larger than
that between the third and fourth. We recoded teacher satisfaction into two categories: (1) "very
dissatisfied" or "somewhat dissatisfied" or (2) "somewhat satisfied" or "very satisfied." We then
examined item-specific impacts on the outcome defined by these dichotomous variables. Of the
38 differences examined among teachers in one-year districts in fall 2007 and fall 2008 shown in
Table D.3, one was statistically significant. Treatment teachers were significantly less likely than
control teachers to report satisfaction with salary and benefits in fall 2007. Of the 38 differences
examined among teachers in two-year districts in fall 2007 and fall 2008 shown in Table D.4, one
was statistically significant. Treatment teachers were significantly more likely than control teachers to
report satisfaction with school facilities (buildings and grounds) in fall 2007.
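
The recoding of the four-point satisfaction scale into a dichotomous outcome can be sketched in a few lines. This is an illustration only: the function names and the sample responses are hypothetical, and the study's weighting and regression adjustment are not reproduced here.

```python
def recode_satisfaction(response: int) -> int:
    """Collapse the four-point satisfaction scale into a 0/1 indicator.

    1 = very dissatisfied, 2 = somewhat dissatisfied -> 0
    3 = somewhat satisfied, 4 = very satisfied       -> 1
    """
    if response not in (1, 2, 3, 4):
        raise ValueError(f"unexpected response code: {response}")
    return 1 if response >= 3 else 0


def percent_satisfied(responses) -> float:
    """Unweighted percentage of responses recoded as satisfied."""
    flags = [recode_satisfaction(r) for r in responses]
    return 100.0 * sum(flags) / len(flags)


# Hypothetical item responses for two small groups (not study data)
treatment = [4, 3, 2, 3, 1, 4]
control = [3, 2, 2, 4, 1, 3]

# Raw treatment-control gap in percentage points for this made-up item
print(percent_satisfied(treatment) - percent_satisfied(control))
```

Because the recoded outcome uses only the split between dissatisfied and satisfied responses, it does not depend on the four categories being equally spaced, which is the concern the recoding is meant to address.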


Teacher satisfaction was not measured in one-year districts in spring 2007.

The item-specific impacts for fall 2005, spring 2006, fall 2006, and spring 2007 can be found in Isenberg et al.
(2009).

Table D.1. Impacts on Teacher Satisfaction (Scores on a Four-Point Scale): One-Year Districts



                          Treatment   Control   Difference   P-value   Sample Size
                                                                        (Teachers)

Fall 2005
Feel Satisfied with:
  School                     3.1        3.1        0.0        0.751        498
  Class                      3.0        3.0        0.1        0.339        498
  Teaching career            3.0        3.0       -0.1        0.290        498

Spring 2006
Feel Satisfied with:
  School                     3.0        3.0        0.0        0.927        492
  Class                      3.0        3.0        0.0        0.720        492
  Teaching career            2.9        2.9       -0.1        0.201        492

Fall 2006
Feel Satisfied with:
  School                     3.2        3.1        0.0        0.843        472
  Class                      3.1        3.1        0.0        0.812        472
  Teaching career            3.0        3.0        0.0        0.615        472

Fall 2007
Feel Satisfied with:
  School                     3.1        3.1        0.0        0.944        424
  Class                      3.2        3.1        0.1        0.155        425
  Teaching career            2.9        2.9        0.0        0.701        426

Fall 2008
Feel Satisfied with:
  School                     3.1        3.1        0.0        0.786        397
  Class                      3.1        3.2        0.0        0.609        397
  Teaching career            2.9        2.9        0.0        0.910        396

Source: Mathematica First, Second, Third, Fifth, and Sixth Induction Activities Surveys administered in fall 2005,
spring 2006, fall 2006, fall 2007, and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and
regression adjusted to account for differences in districts, teacher grade assignments, study design, and
the clustering of teachers within schools. Satisfaction scale: (1) very dissatisfied, (2) somewhat
dissatisfied, (3) somewhat satisfied, or (4) very satisfied. Sample sizes vary due to item nonresponse.

None of the differences is statistically significant at the 0.05 level.

Table D.2. Impacts on Teacher Satisfaction (Scores on a Four-Point Scale): Two-Year Districts



Treatment 

Control 

Difference 

P-value 

Sample Size 
(Teachers) 

Fall 2005 

Feel Satisfied with: 

School 

3.1 

3.1 

0.0 

0.908 

391 

Class 

3.1 

3.1 

0.0 

0.895 

391 

Teaching career 

3.0 

3.1 

-0.1 

0.127 

391 

Spring 2006 

Feel Satisfied with: 

School 

3.1 

3.0 

0.0 

0.596 

384 

Class 

3.0 

3.0 

0.0 

0.977 

384 

Teaching career 

2.9 

3.0 

-0.1 

0.286 

384 

Fal 2006 

Feel Satisfied with: 

School 

3.1 

3.2 

0.0 

0.793 

360 

Class 

3.2 

3.1 

0.1 

0.280 

360 

Teaching career 

3.0 

3.0 

0.0 

0.999 

359 

Spring 2007 

Feel Satisfied with: 

School 

3.0 

2.9 

0.1 

0.207 

370 

Class 

3.1 

3.1 

0.0 

0.797 

370 

Teaching career 

2.8 

2.9 

-0.1 

0.214 

370 

Fall 2007 

Feel Satisfied with: 

School 

3.1 

3.0 

0.1 

0.339 

321 

Class 

3.1 

3.1 

0.0 

0.912 

324 

Teaching career 

2.9 

2.9 

0.0 

0.856 

326 

Fall 2008 

Feel Satisfied with: 

School 

3.1 

3.1 

0.0 

0.874 

319 

Class 

3.1 

3.1 

0.0 

0.897 

318 

Teaching career 

2.8 

2.8 

0.0 

0.852 

319 


Source: Mathematics First. Second. Third. Fifth, and Sixth Induction Activities Surveys administered in fal 2005. 

spring 2006. fall 2006. fall 2007. and fall 2008 to all study teachers and Fourth Induction Activities 
Survey administered in spring 2007 to study teachers in two-year districts. 

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and 

regression adjusted to account for differences in districts, teacher grade assignments, study design, and 
the clustering of teachers within schools. Satisfaction scale: (1) very dissatisfied. (2) somewhat 
dissatisfied. (3) somewhat satisfied, or (4) very satisfied. Sample sizes vary due to item nonresponse. 

None of the differences is statistically significant at the 0.05 level. 
Table D.3. Impacts on Teacher Satisfaction (Percentage "Somewhat Satisfied" or "Very Satisfied"): One-Year Districts,
Fall 2007 and Fall 2008

Fall 2007
Area of Satisfaction                            Treatment   Control   Difference   Effect Size   P-value
Satisfaction with School
  Administration support for beginning teachers      69.3      73.4         -4.1         -0.09     0.261
  Availability of resources and                      67.9      66.7          1.3          0.03     0.750
    materials/equipment for your classroom
  Input into school policies and practices           67.1      68.8         -1.7         -0.04     0.642
  Opportunities for professional development         75.5      74.3          1.3          0.03     0.720
  Principal's leadership and vision                  68.6      65.4          3.2          0.07     0.396
  Professional caliber of colleagues                 70.2      71.7         -1.5         -0.03     0.678
  Supportive atmosphere among                        73.0      71.3          1.7          0.04     0.609
    faculty/collaboration with colleagues
  School facilities such as the building             66.8      62.0          4.7          0.10     0.275
    or grounds
  School policies                                    71.6      74.3         -2.7         -0.06     0.415
Satisfaction with Students
  Autonomy or control over own classroom             77.9      74.7          3.3          0.08     0.279
  Student motivation to learn                        65.6      59.5          6.1          0.13     0.115
  Student discipline and behavior                    64.9      58.7          6.3          0.13     0.116
  Parental involvement in the school                 46.8      39.7          7.1          0.14     0.142
  Grade assignment                                   84.4      82.3          2.2          0.06     0.308
  Students assigned                                  78.1      79.3         -1.2         -0.03     0.686
Satisfaction with Teaching Career
  Salary and benefits                                54.2      64.6        -10.4*        -0.21     0.009
  Professional prestige                              68.9      65.4          3.5          0.08     0.334
  Intellectual challenge                             78.5      78.5          0.0          0.00     0.998
  Workload                                           46.8      46.8          0.0          0.00     0.997
Sample Size (Teachers)                                219       207          426

Fall 2008
Area of Satisfaction                            Treatment   Control   Difference   Effect Size   P-value
Satisfaction with School
  Administration support for beginning teachers      64.6      66.5         -1.9         -0.04     0.600
  Availability of resources and                      59.7      60.4         -0.7         -0.01     0.868
    materials/equipment for your classroom
  Input into school policies and practices           60.3      63.9         -3.6         -0.08     0.355
  Opportunities for professional development         68.9      73.5         -4.6         -0.10     0.186
  Principal's leadership and vision                  63.9      59.1          4.8          0.10     0.210
  Professional caliber of colleagues                 65.9      67.8         -2.0         -0.04     0.581
  Supportive atmosphere among                        70.5      67.4          3.1          0.07     0.370
    faculty/collaboration with colleagues
  School facilities such as the building             60.5      63.9         -3.4         -0.07     0.375
    or grounds
  School policies                                    67.8      66.1          1.7          0.04     0.642
Satisfaction with Students
  Autonomy or control over own classroom             72.3      72.6         -0.3         -0.01     0.943
  Student motivation to learn                        61.0      64.8         -3.7         -0.08     0.330
  Student discipline and behavior                    56.0      60.0         -4.0         -0.08     0.352
  Parental involvement in the school                 45.7      44.3          1.3          0.03     0.766
  Grade assignment                                   78.2      78.7         -0.5         -0.01     0.833
  Students assigned                                  71.3      75.7         -4.4         -0.10     0.157
Satisfaction with Teaching Career
  Salary and benefits                                55.3      57.0         -1.7         -0.03     0.676
  Professional prestige                              64.4      62.2          2.3          0.05     0.559
  Intellectual challenge                             72.4      75.7         -3.2         -0.07     0.331
  Workload                                           45.4      42.6          2.8          0.06     0.520
Sample Size (Teachers)                                206       192          398

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression adjusted to account for differences in districts, teacher grade
assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.
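
The table reports an effect size beside each percentage-point difference, but this appendix does not state how it is computed. As a minimal sketch, the block below uses one common convention: the difference in proportions divided by the pooled standard deviation of a binary outcome. Treating this as the formula behind the table is an assumption, and the function name `effect_size_pct` is hypothetical. For the fall 2007 salary and benefits row (54.2 versus 64.6 percent, with 219 and 207 teachers) the convention gives a value near the reported -0.21, which is suggestive but not confirmation.

```python
import math


def effect_size_pct(treat_pct: float, control_pct: float,
                    n_treat: int, n_control: int) -> float:
    """Standardized difference between two percentages.

    Assumed convention (not stated in the report): divide the difference in
    proportions by the pooled standard deviation of the binary outcome.
    """
    p_t = treat_pct / 100.0
    p_c = control_pct / 100.0
    pooled = (n_treat * p_t + n_control * p_c) / (n_treat + n_control)
    sd = math.sqrt(pooled * (1.0 - pooled))
    if sd == 0.0:
        raise ValueError("pooled proportion is 0 or 1; effect size undefined")
    return (p_t - p_c) / sd


# Salary and benefits, fall 2007, one-year districts (values from the table).
salary_es = effect_size_pct(54.2, 64.6, 219, 207)
print(f"{salary_es:.2f}")
```

Because the published percentages are weighted and regression adjusted, a calculation from the rounded table values can only approximate whatever the authors computed.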








Table D.4. Impacts on Teacher Satisfaction (Percentage "Somewhat Satisfied" or "Very Satisfied"): Two-Year Districts,
Fall 2007 and Fall 2008

Fall 2007
Area of Satisfaction                            Treatment   Control   Difference   Effect Size   P-value
Satisfaction with School
  Administration support for beginning teachers      70.7      73.2         -2.5         -0.05     0.563
  Availability of resources and                      69.9      63.4          6.5          0.14     0.165
    materials/equipment for your classroom
  Input into school policies and practices           68.3      59.8          8.5          0.18     0.102
  Opportunities for professional development         77.4      71.3          6.1          0.14     0.115
  Principal's leadership and vision                  70.8      67.7          3.1          0.07     0.467
  Professional caliber of colleagues                 71.1      75.6         -4.6         -0.10     0.256
  Supportive atmosphere among                        74.5      73.2          1.4          0.03     0.748
    faculty/collaboration with colleagues
  School facilities such as the building             69.5      55.5         14.0*         0.29     0.004
    or grounds
  School policies                                    69.8      74.4         -4.5         -0.10     0.295
Satisfaction with Students
  Autonomy or control over own classroom             77.3      76.8          0.4          0.01     0.901
  Student motivation to learn                        61.3      65.2         -3.9         -0.08     0.393
  Student discipline and behavior                    [values illegible in source]
  Parental involvement in the school                 45.5      45.7         -0.3         -0.01     0.957
  Grade assignment                                   79.4      81.1         -1.7         -0.04     0.614
  Students assigned                                  73.2      77.4         -4.2         -0.10     0.274
Satisfaction with Teaching Career
  Salary and benefits                                61.1      56.7          4.4          0.09     0.349
  Professional prestige                              72.4      65.2          7.2          0.15     0.094
  Intellectual challenge                             78.8      78.7          0.1          0.00     0.977
  Workload                                           50.0      47.6          2.5          0.05     0.630
Sample Size (Teachers)                                179       147          326

Fall 2008
Area of Satisfaction                            Treatment   Control   Difference   Effect Size   P-value
Satisfaction with School
  Administration support for beginning teachers      67.5      70.0         -2.5         -0.05     0.512
  Availability of resources and                      64.5      61.8          2.8          0.06     0.509
    materials/equipment for your classroom
  Input into school policies and practices           58.5      62.9         -4.4         -0.09     0.371
  Opportunities for professional development         69.8      68.2          1.6          0.03     0.685
  Principal's leadership and vision                  65.3      65.3          0.0          0.00     0.995
  Professional caliber of colleagues                 65.2      67.7         -2.5         -0.05     0.596
  Supportive atmosphere among                        64.5      67.7         -3.1         -0.07     0.539
    faculty/collaboration with colleagues
  School facilities such as the building             62.5      58.8          3.7          0.08     0.474
    or grounds
  School policies                                    69.0      67.7          1.3          0.03     0.753
Satisfaction with Students
  Autonomy or control over own classroom             69.8      75.3         -5.5         -0.12     0.190
  Student motivation to learn                        61.5      63.5         -2.1         -0.04     0.668
  Student discipline and behavior                    [values illegible in source]
  Parental involvement in the school                 47.8      47.7          0.2          0.00     0.975
  Grade assignment                                   75.7      76.5         -0.8         -0.02     0.805
  Students assigned                                  67.6      71.8         -4.2         -0.09     0.317
Satisfaction with Teaching Career
  Salary and benefits                                43.8      46.5         -2.6         -0.05     0.629
  Professional prestige                              64.7      59.4          5.3          0.11     0.278
  Intellectual challenge                             71.3      72.3         -1.1         -0.02     0.772
  Workload                                           44.4      48.8         -4.4         -0.09     0.372
Sample Size (Teachers)                                178       143          321

Source: Mathematica Fifth and Sixth Induction Activities Surveys administered in fall 2007 and fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression adjusted to account for differences in districts, teacher grade
assignments, study design, and the clustering of teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.
B. Supplementary Information and Sensitivity Analysis for Teacher Preparedness

Table D.5 presents results for teacher preparedness for one-year districts in fall 2005, spring
2006, and fall 2008.¹³ Table D.6 presents results for teacher preparedness for two-year districts in
fall 2005, spring 2006, spring 2007, and fall 2008.¹⁴ These results are also shown in Chapter VI,
Figures VI.3 and VI.4.

As described earlier, one concern with the analysis of the teacher attitudes data is that the
summary scores may mask impacts for individual items that make up the three summary scores
within each domain. Another concern is that self-reported attitude measures rely on scales that may
not have equal intervals. We recoded teacher preparedness into two categories: (1) “not at all
prepared” or “somewhat prepared” or (2) “well prepared” or “very well prepared.” We then
examined item-specific impacts on the outcomes defined by the dichotomous variables. Of the
13 differences among teachers examined in fall 2008, none were statistically significant in one-year
districts (Table D.7) and one was significant in two-year districts (Table D.8). Treatment teachers in
two-year districts were significantly less likely than control teachers to report feeling prepared to
work with other teachers to plan instruction.¹⁵,¹⁶
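
The recoding step described above can be sketched as follows. The block collapses four-point responses into the two categories and compares the unadjusted share in the higher category across groups. The response lists and the function name `share_prepared` are hypothetical, the 1-to-4 coding order is an assumption, and the sketch leaves out the weighting and regression adjustment applied in the study.

```python
from typing import Sequence


def share_prepared(responses: Sequence[int]) -> float:
    """Return the percentage of responses coded 3 or 4 on a 1-4 scale.

    Assumed coding: 1 = not at all prepared, 2 = somewhat prepared,
    3 = well prepared, 4 = very well prepared.
    """
    if not responses:
        raise ValueError("responses must not be empty")
    if any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("responses must be coded 1, 2, 3, or 4")
    top_two = sum(1 for r in responses if r >= 3)
    return 100.0 * top_two / len(responses)


# Hypothetical item responses for illustration only (not study data).
treatment = [4, 3, 2, 3, 1, 4, 2, 3]
control = [3, 3, 2, 4, 4, 3, 2, 3]

difference = share_prepared(treatment) - share_prepared(control)
print(f"{share_prepared(treatment):.1f} {share_prepared(control):.1f} {difference:+.1f}")
```

Working with the dichotomous variable avoids treating the distance between adjacent scale points as equal, which is the concern the recoding is meant to address.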


13 Teacher preparedness was not measured in one-year districts in fall 2006, spring 2007, or fall 2007.

14 Teacher preparedness was not measured in two-year districts in fall 2007.

15 See Chapter II for a discussion of multiple comparisons and false discoveries.

16 The item-specific impacts for fall 2005 and spring 2006 can be found in Glazerman et al. (2008). The item-specific
impacts for spring 2006 and spring 2007 can be found in Isenberg et al. (2009).
Table D.5. Impacts on Teacher Preparedness (Scores on a Four-Point Scale): One-Year Districts

                            Treatment   Control   Difference   P-value   Sample Size (Teachers)
Fall 2005
  Feel Prepared to:
    Instruct                      2.8       2.8         -0.1     0.179            501
    Work with others              2.9       2.9          0.0     0.688            501
    Work with students            2.8       2.7          0.0     0.786            501
Spring 2006
  Feel Prepared to:
    Instruct                      2.9       3.0         -0.1     0.088            493
    Work with others              2.9       3.0         -0.1     0.067            493
    Work with students            2.8       2.8          0.0     0.511            493
Fall 2008
  Feel Prepared to:
    Instruct                      3.4       3.4          0.0     0.619            386
    Work with others              3.4       3.3          0.1     0.281            386
    Work with students            3.2       3.2          0.0     0.923            386

Source: Mathematica First, Second, and Sixth Induction Activities Surveys administered in fall 2005, spring
2006, and fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and
regression adjusted to account for differences in districts, teacher grade assignments, study design, and
the clustering of teachers within schools. Satisfaction scale: (1) very dissatisfied, (2) somewhat
dissatisfied, (3) somewhat satisfied, or (4) very satisfied. Sample sizes vary due to item nonresponse.

None of the differences is statistically significant at the 0.05 level.
Table D.6. Impacts on Teacher Preparedness (Scores on a Four-Point Scale): Two-Year Districts

                            Treatment   Control   Difference   P-value   Sample Size (Teachers)
Fall 2005
  Feel Prepared to:
    Instruct                      2.8       2.9         -0.1*    0.030            394
    Work with others              2.8       3.0         -0.1     0.178            394
    Work with students            2.7       2.8         -0.1     0.219            394
Spring 2006
  Feel Prepared to:
    Instruct                      3.0       3.0          0.0     0.703            383
    Work with others              3.0       3.0         -0.1     0.338            381
    Work with students            2.9       2.8          0.1     0.472            383
Spring 2007
  Feel Prepared to:
    Instruct                      3.2       3.1          0.0     0.869            371
    Work with others              3.1       3.1          0.0     0.933            371
    Work with students            3.0       3.0          0.0     0.614            371
Fall 2008
  Feel Prepared to:
    Instruct                      3.5       3.5          0.0     0.891            308
    Work with others              3.4       3.5         -0.1     0.255            308
    Work with students            3.3       3.3          0.0     0.824            308

Source: Mathematica First, Second, and Sixth Induction Activities Surveys administered in fall 2005, spring
2006, and fall 2008 to all study teachers and Fourth Induction Activities Survey administered in spring
2007 to study teachers in two-year districts.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and
regression adjusted to account for differences in districts, teacher grade assignments, study design, and
the clustering of teachers within schools. Satisfaction scale: (1) very dissatisfied, (2) somewhat
dissatisfied, (3) somewhat satisfied, or (4) very satisfied. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.
Table D.7. Impacts on Teacher Preparedness (Percentage "Somewhat Prepared" or "Very Prepared"): One-Year
Districts, Fall 2008

Fall 2008
Area of Preparedness                            Treatment   Control   Difference   Effect Size   P-value
Prepared to Instruct
  Managing classroom activities, transitions,        76.1      78.3         -2.2         -0.05     0.457
    and routines
  Using variety of instructional methods             73.5      74.3         -0.8         -0.02     0.783
  Assessing your students                            74.1      74.8         -0.6         -0.01     0.838
  Selecting and adapting instructional materials     72.6      72.6          0.0          0.00     0.992
  Planning effective lessons                         75.5      78.3         -2.7         -0.07     0.299
  Being an effective teacher                         74.9      78.3         -3.3         -0.08     0.246
  Addressing needs of a diversity of learners        67.8      73.9         -6.2         -0.14     0.070
Prepared to Work with Students
  Handling range of classroom behavior or            74.0      73.9          0.1          0.00     0.983
    discipline situations
  Motivating students                                73.5      73.0          0.4          0.01     0.888
  Working effectively with parents                   68.9      71.3         -2.4         -0.05     0.465
  Working with students with special challenges      48.8      54.8         -6.0         -0.12     0.194
Prepared to Work with Other School Staff
  Working with other teachers to plan                72.3      74.8         -2.5         -0.06     0.448
    instruction
  Working with the principal or other                72.7      69.6          3.1          0.07     0.340
    instructional leaders
Sample Size (Teachers)                                206       192          398

Source: Mathematica Sixth Induction Activities Survey administered in fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and regression
adjusted to account for differences in districts, teacher grade assignments, study design, and the clustering of
teachers within schools. Sample sizes vary due to item nonresponse.

None of the differences is statistically significant at the 0.05 level.
Table D.8. Impacts on Teacher Preparedness (Percentage "Somewhat Prepared" or "Very Prepared"): Two-Year
Districts, Fall 2008

Fall 2008
Area of Preparedness                            Treatment   Control   Difference   Effect Size   P-value
Prepared to Instruct
  Managing classroom activities, transitions,        75.8      80.1         -4.4         -0.11     0.260
    and routines
  Using variety of instructional methods             74.3      74.8         -0.6         -0.01     0.852
  Assessing your students                            75.4      78.9         -3.6         -0.09     0.283
  Selecting and adapting instructional materials     72.0      74.3         -2.3         -0.05     0.588
  Planning effective lessons                         76.3      79.5         -3.3         -0.08     0.322
  Being an effective teacher                         74.6      80.1         -5.6         -0.13     0.112
  Addressing needs of a diversity of learners        72.0      74.3         -2.3         -0.05     0.541
Prepared to Work with Students
  Handling range of classroom behavior or            73.0      71.3          1.7          0.04     0.690
    discipline situations
  Motivating students                                72.8      74.8         -2.1         -0.05     0.584
  Working effectively with parents                   68.9      76.0         -7.1         -0.16     0.055
  Working with students with special challenges      48.9      52.6         -3.7         -0.07     0.434
Prepared to Work with Other School Staff
  Working with other teachers to plan                68.7      78.9        -10.3*        -0.23     0.015
    instruction
  Working with the principal or other                69.6      73.7         -4.1         -0.09     0.354
    instructional leaders
Sample Size (Teachers)                                178       143          321

Source: Mathematica Sixth Induction Activities Survey administered in fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and regression
adjusted to account for differences in districts, teacher grade assignments, study design, and the clustering of
teachers within schools. Sample sizes vary due to item nonresponse.

*Significantly different from zero at the 0.05 level.
C. Sensitivity and Supplemental Analysis for Teacher Retention

1. Detail on Retention Measures


Chapter VI presented figures that summarized the retention rates for treatment and control
teachers (Figures VI.7 and VI.8). Here we include tables with the numbers underlying those results
for the final year of followup (Tables D.9 and D.10). The tables present additional detail on
retention rates, along with sample sizes, to complement the findings shown in Figures VI.7 and
VI.8. They include the regression-adjusted percentages of teachers who were retained in the same
school for one-year districts (53 percent) and two-year districts (51 percent). The corresponding
numbers for years 1 and 2 of the study are presented in earlier reports (see Glazerman et al. 2008;
Isenberg et al. 2009).


Table D.9. Impacts on Teacher Retention Rates After Three Years (Percentages): One-Year Districts

Outcome                               All Teachers   Treatment   Control   Difference   P-value
Retained in the same school                   53.3        53.9      52.7          1.2     0.804
Retained in the same district                 69.3        69.1      69.6         -0.5  [illegible]
Retained in the teaching profession           87.4 [illegible]      86.3          2.3  [illegible]
Sample Size (Teachers)                         464         237       227
Sample Size (Schools)                          224         114       110

Source: Mathematica Teacher Background Survey administered in fall 2005 and Third Teacher Mobility Survey
administered in fall 2008 to all study teachers.

Note: Data pertain to teachers in one-year districts participating in the study. Data are regression adjusted
using a logit model with robust standard errors to account for baseline characteristics and clustering of
teachers within schools.

None of the differences is statistically significant at the 0.05 level.


Table D.10. Impacts on Teacher Retention Rates After Three Years (Percentages): Two-Year Districts

Outcome                               All Teachers   Treatment   Control   Difference   P-value
Retained in the same school                   50.9        54.2      47.1          7.1     0.159
Retained in the same district                 63.0        64.9      60.9          4.0     0.388
Retained in the teaching profession           84.7        84.4      85.1         -0.7     0.850
Sample Size (Teachers)                         375         208       167
Sample Size (Schools)                          152          82        70

Source: Mathematica Teacher Background Survey administered in fall 2005 and Third Teacher Mobility Survey
administered in fall 2008 to all study teachers.

Note: Data pertain to teachers in two-year districts participating in the study. Data are regression adjusted
using a logit model with robust standard errors to account for baseline characteristics and clustering of
teachers within schools.

None of the differences is statistically significant at the 0.05 level.
Cumulative attrition of control group members from the teaching profession was 15 percent
after three years, on a pace to reach 25 percent after five years if we were to make a linear
extrapolation of five percentage points per year. However, policy discussions about the problem of
teacher turnover frequently cite attrition rates that are twice as high. For example, Stansbury and
Zimmerman (2000) noted that “a third of beginning teachers quit within their first three years.” The
National Council on Teaching and America’s Future (2003) noted that “within three years,
33 percent will leave and after five years... half of all new teachers will have exited the profession.”
The Alliance for Excellent Education (2004) reported that “after just three years it is estimated that
almost a third of the new entrants to teaching have left the field, and after five years almost half are
gone.” The American Association of State Colleges and Universities (2006) reported that “nearly half of
new teachers leave the classroom within the first five years.” The source for these statements is the
1991 Schools and Staffing Survey (SASS), conducted by the National Center for Education
Statistics, yet the most authoritative analysis of those data indicated that 24 percent leave after two
years and 46 percent leave after five years (Ingersoll 2002), rates that are still higher than those
obtained in the current study.
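
The linear extrapolation mentioned above is simple arithmetic, sketched here for concreteness. The function name `extrapolate_attrition` is hypothetical, and the constant annual rate is the simplifying assumption the text itself makes, not a forecast.

```python
def extrapolate_attrition(observed_pct: float, observed_years: int,
                          target_years: int) -> float:
    """Extend cumulative attrition linearly at the average observed annual rate."""
    if observed_years <= 0:
        raise ValueError("observed_years must be positive")
    annual_rate = observed_pct / observed_years  # percentage points per year
    return observed_pct + annual_rate * (target_years - observed_years)


# 15 percent after three years implies five points per year.
five_year = extrapolate_attrition(15.0, 3, 5)

# Gap relative to the 46 percent five-year figure attributed to Ingersoll (2002).
gap_vs_sass = 46.0 - five_year
print(five_year, gap_vs_sass)
```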

Given the concern about lower estimates of prevailing teacher retention rates, we compared the
retention analysis in this study to similar analyses published by other researchers using a variety of
data sources. The comparisons are meant to help clarify any discrepancies and place our study in
perspective. Figure D.1 presents the survival curves for the current study alongside those that other
researchers have calculated in different settings and time points around the country.

The data sources include one set of estimates based on SASS, the national survey, which
followed teachers of all experience levels from 1991 to 1992, and several others from administrative
datasets in three states (California, Florida, and Georgia) plus one district (New York City). The
administrative data included various cohorts of teachers who began their careers in the late 1990s
and early 2000s.

In contrast, this study followed the 1,009 teachers from the start of their careers in 2005 in
traditional public schools in urban districts meeting our size and poverty criteria. We followed this
single cohort of teachers into (what would be) the fourth year of their teaching career using surveys
with response rates of at least 85 percent. We did not follow any teachers who were hired after the
first week of school, because of concerns that the hiring of these teachers could be affected by
treatment status. Nor did we follow temporary teachers, such as long-term substitutes. The retention
rates for the study were approximately 95, 90, and 85 percent for the first three years, respectively.

The study of teachers in Georgia in the late 1990s by Scafidi et al. (2008) presented survival
curves in which retention rates were about 85, 75, and 65 percent after each of the teachers' first
three years, lower than comparable numbers for the teacher induction study. The difference,
however, could be due to time period, geographic area, or method of measurement. In the Georgia
study, figures were reported for women only and pertain to a period almost ten years prior to the
current induction study. This study treated a teacher as a leaver if she left the state, even if she took a
teaching position in a different state, or if she remained in state but was teaching in a private school.
Figure D.1. Teacher Survival Curves, Published Estimates

[Figure: survival curves showing percent remaining (vertical axis) by years after hire (horizontal axis), with
one curve for each of the following: Induction study in 17 districts (2005 cohort); California (1995 cohort);
New York City (2000-2004 cohorts); National (SASS/TFS, 1989-1991 cohorts); Florida (2001-2003 cohorts);
Georgia (1994-1999 cohorts).]

Source: Mathematica First, Second, and Third Teacher Mobility Surveys administered in fall 2006, fall 2007, and
fall 2008 to all study teachers; California Teaching Commission 2002; Boyd et al. 2008; Ingersoll 2002;
West and Chingos 2009; Scafidi et al. 2008.

A study of beginning teachers in Florida (West and Chingos 2009) reported retention rates of
82, 75, and 68 percent, all within 3 points of the corresponding Georgia estimates. Again, the
measure of retention is from state administrative data and only includes retention in public schools
in the state. Teachers who moved to private or charter schools are treated as if they had left the
profession. The Florida data pertain to teachers who began in 2001 through 2003.¹⁷

A study of beginning teachers in New York City (Boyd et al. 2008) reported retention rates of
91, 80, and 72 percent. They focused on the three cohorts that started teaching between 2000 and
2004. Like the Florida study, the New York City study counts a teacher as being retained if she
continues teaching public school in the state and as a leaver if she works in a private school, works in an
out-of-state school, or leaves teaching entirely.

A study by California’s Commission on Teacher Credentialing reported rates of retention in the
state’s public education system that were 94, 90, and 87 percent, closer to those reported in the current
study. The California study, which used a state credentialing database and a state employment
database, followed a cohort that began teaching 10 years before the induction study and measured
retention in public education in the state, although California is a larger state than Georgia or
Florida.

17 The teachers are only observed through 2005, so the two-year retention estimate is based on the 2001 and 2002
cohorts only and the three-year retention estimate is based on the 2001 cohort only.

Finally, Ingersoll (2002), the source frequently cited, is based on the SASS (conducted in 1990-1991)
and the Teacher Followup Survey (TFS), which re-interviewed SASS leavers and a sample of
SASS stayers one year later. Because it used survey methods instead of administrative data, the
SASS/TFS analysis was able to count teachers who moved between sectors (public/private) and
between states as stayers, but it covers an earlier time period than the induction study and includes
suburban, rural, charter, and private schools.¹⁸

These comparisons show that estimates of teacher retention can vary over time and across 
different settings and methodologies. Only survey-based estimates are able to account for moves out 
of state and moves into private and charter schools, but estimates based on survey data can vary 
depending on when the data were collected, and whether public school teachers or a broader set of 
teachers form the population of interest. 
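
The comparisons above can be lined up side by side with a short sketch. It uses the cumulative retention rates quoted in the text for four of the studies; the dictionary keys, the variable names, and the helper `gaps` are hypothetical, and the induction study and Georgia figures are the approximate values given in the text.

```python
from typing import Dict, Tuple

Rates = Tuple[int, int, int]  # percent retained after years 1, 2, and 3

# Figures as quoted in the text (some described there as approximate).
RETENTION: Dict[str, Rates] = {
    "induction": (95, 90, 85),
    "georgia": (85, 75, 65),
    "florida": (82, 75, 68),
    "california": (94, 90, 87),
}


def gaps(reference: Rates, other: Rates) -> Tuple[int, ...]:
    """Percentage-point difference (other minus reference) for each year."""
    return tuple(o - r for r, o in zip(reference, other))


florida_vs_georgia = gaps(RETENTION["georgia"], RETENTION["florida"])
california_vs_induction = gaps(RETENTION["induction"], RETENTION["california"])

for name, rates in RETENTION.items():
    print(name, rates, gaps(RETENTION["induction"], rates))
```

The Florida and Georgia rates never differ by more than 3 points, which matches the statement in the text, while the California rates stay within 2 points of the induction study.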

2. Sensitivity of Impact Findings 

We conducted several sensitivity tests to examine the robustness of the findings regarding the
impacts of comprehensive induction on teacher retention. The results of these tests, shown in
Tables D.11 and D.12, suggest that under a wide range of assumptions, we confirmed the finding
that there was no significant impact of the treatment.

The conclusions did not change when we used an enhanced weight that incorporated
information from the teacher background survey or when no weights were used.¹⁹ Nor did they
change when information was incorporated from data sources other than the mobility survey. For
example, we recoded the mobility status of nonrespondents who appeared in the student test score
databases provided by the districts, reclassifying such teachers as district stayers. Similarly, we
recoded the mobility status of nonrespondents who were flagged as unlocatable by the data
collectors who called and visited the schools, reclassifying such teachers as district leavers. The
variables edited in this way used more of the sample but led to the same conclusion of no significant
impact of treatment.
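
Tables D.11 and D.12 also report estimates under alternative assumptions about how many nonrespondents moved or left. The idea behind that kind of bounding exercise can be sketched as follows. All counts are hypothetical, the function name `retention_rate` is an assumption, and the calculation is unweighted and unadjusted, so it illustrates the logic rather than reproducing the study's estimates.

```python
def retention_rate(stayers: int, respondents: int, nonrespondents: int,
                   share_movers: float) -> float:
    """Percent retained when a given share of nonrespondents is assumed to have moved."""
    if not 0.0 <= share_movers <= 1.0:
        raise ValueError("share_movers must be between 0 and 1")
    imputed_stayers = nonrespondents * (1.0 - share_movers)
    return 100.0 * (stayers + imputed_stayers) / (respondents + nonrespondents)


# Hypothetical counts: (stayers among respondents, respondents, nonrespondents).
TREATMENT = (140, 200, 50)
CONTROL = (138, 200, 50)

# Same assumption applied to both groups.
same_assumption = {
    share: retention_rate(*TREATMENT, share) - retention_rate(*CONTROL, share)
    for share in (0.0, 0.25, 0.5, 1.0)
}

# Extreme bounds: the assumption least and most favorable to the treatment group.
lower_bound = retention_rate(*TREATMENT, 1.0) - retention_rate(*CONTROL, 0.0)
upper_bound = retention_rate(*TREATMENT, 0.0) - retention_rate(*CONTROL, 1.0)
print(same_assumption, lower_bound, upper_bound)
```

When the same assumption is applied to both groups, the estimated difference changes little; only the extreme cases, which treat the two groups' nonrespondents in opposite ways, produce large differences.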


18 The SASS/TFS sample included only one year of followup, but the sample had teachers of all experience levels.
Thus, the author was able to calculate cumulative retention rates using the probabilities of turnover for second- and
third-year teachers in the same time period. It therefore combines information from cohorts that began teaching in 1988,
1989, and 1990.

19 Unlike the enhanced weights, the benchmark weights rely only on school characteristics from the Common Core
of Data compiled by the U.S. Department of Education. The enhanced weights used information on teacher’s gender,
age, race/ethnicity, home ownership, residence in the district, ACT/SAT score, preparation (whether completed a
traditional four-year teacher training program), prior career, prior experience teaching, whether the teacher was hired
after the school year began, whether the teacher attended a selective college/university, whether the teacher majored in
an education-related field, and the amount of student teaching experience.
Table D.11. Mobility Impacts After Three Years Under Alternative Assumptions: One-Year Districts

                                                                                                   Difference
                                                                           Treatment      Control   (Estimated
Outcome and Assumption                                                    Group Mean   Group Mean      Impact)
Retention in the District
  Respondents
    Benchmark weights (benchmark estimates)                                     69.0         69.5         -0.5
    No weights                                                                  68.9         70.2         -1.3
    Enhanced weights                                                            69.3         69.1          0.2
    Alternative variance estimation (random effects model)                      69.5         68.7          0.8
    Alternative set of control variables 1                                      68.5         69.7         -1.3
    Alternative set of control variables 2                                      69.7         68.9          0.8
    Linear probability model                                                    68.9         70.0         -1.2
    Multinomial logit model                                                     69.2         69.9         -0.1
  Respondents and Nonrespondents
    Assume 100% of treatment nonrespondents are movers, 0% of controls          55.4         66.8        -11.4*
    Assume 0% of nonrespondents are movers                                      66.6         66.8         -0.2
    Assume 25% of nonrespondents are movers                                     69.0         71.4         -2.4
    Assume 50% of nonrespondents are movers                                     65.9         66.8         -0.9
    Assume 100% of nonrespondents are movers                                    55.3         52.7          2.6
    Assume 0% of treatment nonrespondents are movers, 100% of controls          66.6         52.8         13.8*
  Respondents and Selected Nonrespondents
    Recode selected nonrespondents from other data sources                      70.6         71.7         -1.0
    Recode selected nonrespondents and assume 100% of other                     60.5         57.2          3.2
      nonrespondents are movers
Retention in the Teaching Profession
  Respondents
    Benchmark weights (benchmark estimates)                                     88.5         86.3          2.3
    No weights                                                                  88.4         86.4          2.1
    Enhanced weights                                                            88.6         85.9          2.7
    Alternative variance estimation (random effects model)                      88.3         85.9          2.4
    Alternative set of control variables 1                                      88.0         86.4          1.6
    Alternative set of control variables 2                                      88.1         86.5          1.6
    Linear probability model                                                    88.1         86.8          1.3
    Multinomial logit model                                                     87.3         84.4          2.9
  Respondents and Nonrespondents
    Assume 100% of treatment nonrespondents are leavers, 0% of controls         77.7         88.4        -10.7*
    Assume 0% of nonrespondents are leavers                                     89.2         88.2          1.0
    Assume 25% of nonrespondents are leavers                                    86.1         85.4          0.7
    Assume 50% of nonrespondents are leavers                                    83.4         81.3          2.1
    Assume 100% of nonrespondents are leavers                                   77.7         74.2          3.4
    Assume 0% of treatment nonrespondents are leavers, 100% of controls         89.2         74.2         14.9*
  Respondents and Selected Nonrespondents
    Recode selected nonrespondents from other data sources                      88.8         87.0          1.8
    Recode selected nonrespondents and assume 100% of other                     82.9         78.7          4.1
      nonrespondents are leavers
Sample Size (Teachers)
  Respondents                                                                    215          201          416
  Respondents and Selected Nonrespondents                                        228          215          443
  Respondents and Nonrespondents                                                 267          265          532

Source: Mathematica Third Teacher Mobility Survey administered in fall 2008 to all study teachers.

*Significantly different from zero at the 0.05 level.
Table D.12. Mobility Impacts After Three Years Under Alternative Assumptions: Two-Year Districts



Treatment 

Control 

Difference 


Group Mean 

Group 

(Estimated 

Outcome and Assumption 


Mean 

Impact) 

Retention in the District




Respondents 




Benchmark weights (benchmark estimates) 

64.8 

60.8 

4.0 

No weights 

64.9 

60.7 

4.2 

Enhanced weights 

64.8 

60.8 

4.0 

Alternative vanance estimation (random effects model) 

66.0 

60.7 

5.3 

Alternative set of control vanables 1 

65.9 

61.3 

4.6 

Alternative set of control vanables 2 

64.5 

61.3 

3.2 

Linear probability model 

65.3 

61.1 

4.2 

Multinomial logit model 

64.9 

60.6 

4.3 

Respondents and Nonrespondents 




Assume 100% of treatment nonrespondents are movers, 0% of controls 

55.5 

63.3 

-7.8 

Assume 0% of nonrespondents are movers 

62.1 

62.9 

-0.8 

Assume 25% of nonrespondents are movers 

67.4 

63.4 

4.0 

Assume 50% of nonrespondents are movers 

65.3 

58.3 

6.9 

Assume 100% of nonrespondents are movers 

55.9 

46.5 

9.4* 

Assume 0% of treatment nonrespondents are movers. 100% of controls 

62.5 

46.0 

16.5* 

Respondents and Selected Nonrespondents 




Recode selected nonrespoiktents from other data sources 

66.4 

64.6 

1.8 

Recode selected nonrespondents and assume 100% of other 

58.5 

55.3 

3 2 

nonrespondents are movers 




Retention in the Teaching Profession 




Respondents 




Benchmark weights (benchmark estimates) 

84.4 

85.1 

-0.7 

No weights 

84.4 

84.9 

-0.4 

Enhanced weights 

84.4 

84.9 

-0.5 

Alternative vanance estimation (random effects model) 

85.1 

84.7 

0.3 

Alternative set of control vanables 1 

85.0 

85.1 

-0.1 

Alternative set of control vanables 2 

84.4 

85.2 

-0.8 

Linear probability model 

84.8 

85.0 

-0.2 

Multinomial logit model 

83.4 

84.0 

-0.6 

Respondents and Nonrespondents 




Assume 100% of treatment nonrespondents are leavers, 0% of controls 

79.2 

87.4 

-8.2* 

Assume 0% of nonrespondents are leavers 

85.9 

87.1 

-1.2 

Assume 25% of nonrespondents are leavers 

85.4 

83.1 

2.3 

Assume 50% of nonrespondents are leavers 

83.3 

78.4 

4.9 

Assume 100% of nonrespondents are leavers 

79.9 

70.7 

9.2* 

Assume 0% of treatment nonrespondents are leavers, 100% of controls 

86.5 

70.1 

16.4* 

Respondents and Selected Nonrespondents 




Recode selected nonrespondents from other data sources 

85.3 

85.9 

-0.6 

Recode selected nonrespondents and assume 100% ot other 

82.2 

79.6 

2.6 

nonrespondents are leavers 




Sample Size (Teachers) 




Respondents 

190 

154 

344 

Respondents and Selected Nonrespondents 

196 

172 

368 

Respondents and Nonrespondents 

222 

199 

421 


Source: Mathematics Third Teacher Mobility Survey adnwxstered In fall 2008 to all study teachers. 

•Significantly different from zero at the 0.05 level. 

When we re-estimated the treatment effects using a linear probability model and a multinomial 
logit model, we reached the same conclusions as when we used the benchmark model, which 
consisted of separate logistic regressions to predict the probability of being a stayer or a nonleaver 
(see Tables D.11 and D.12). 

Tables D.11 and D.12 also present the results of imposing alternative assumptions regarding 
nonrespondents to the mobility survey. For that exercise, we recalculated the impact estimates under 
a range of different assumptions about the retention rates for nonrespondents. Under a wide range 
of assumptions, the treatment-control differences were not statistically significant. Under the most 
extreme assumptions, however, there could be significant treatment effects on retention ranging 
from -11 to +16 percentage points. Since there is no evidence to support these extreme 
assumptions, we report them here for context only. 
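
The logic of this bounding exercise can be illustrated with a minimal, unweighted sketch. It is not the estimation procedure behind Tables D.11 and D.12, which use weights and regression adjustment, so it will not reproduce the published figures. The function names and the 85 percent respondent retention rate are hypothetical; the respondent and nonrespondent counts are loosely based on the sample sizes in Table D.12.

```python
def retention_rate(resp_rate, n_resp, n_nonresp, share_leavers):
    """Combined retention rate for respondents plus nonrespondents.

    resp_rate     -- observed retention rate among respondents (0 to 1)
    n_resp        -- number of survey respondents
    n_nonresp     -- number of survey nonrespondents
    share_leavers -- ASSUMED share of nonrespondents who left (0 to 1)
    """
    retained = resp_rate * n_resp + (1.0 - share_leavers) * n_nonresp
    return retained / (n_resp + n_nonresp)


def impact_under_assumption(treat, control, share_t, share_c):
    """Treatment-control difference in retention, in percentage points."""
    rate_t = retention_rate(*treat, share_t)
    rate_c = retention_rate(*control, share_c)
    return 100.0 * (rate_t - rate_c)


# Hypothetical inputs: (respondent retention rate, respondents, nonrespondents).
treat = (0.85, 190, 32)
control = (0.85, 154, 45)

# Assumed share of nonrespondents who left: (treatment, control).
scenarios = {
    "100% of treatment, 0% of controls": (1.0, 0.0),
    "0% of all nonrespondents": (0.0, 0.0),
    "50% of all nonrespondents": (0.5, 0.5),
    "100% of all nonrespondents": (1.0, 1.0),
    "0% of treatment, 100% of controls": (0.0, 1.0),
}

for label, (share_t, share_c) in scenarios.items():
    diff = impact_under_assumption(treat, control, share_t, share_c)
    print(f"{label:36s} {diff:+6.1f}")
```

The two asymmetric scenarios give the widest possible range of differences, which is why they serve as outer bounds rather than as plausible estimates.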

Breaking down the overall impact estimates on retention into separate impacts by district shows 
that the results are not driven by one or two outlier districts. Figures D.2 and D.3 show impacts on 
retention of teachers in their original district, while Figures D.4 and D.5 show impacts on retention 
in the profession. Although the study was not designed to detect district-specific estimates, these 
results illustrate the possible heterogeneity across districts. However, the lack of obvious 
discontinuities in the distributions is consistent with the hypothesis that the variation in 
district-specific results represents a single treatment effect along with sampling variation. 

D. Supplementary Analysis of Composition of the Workforce 

Another set of data from Chapter VI that we expand on here is the comparison of treatment 
and control stayers. The rationale for comparing stayers to stayers only, without regard to the movers 
or leavers, for whom we also have data, can be summarized as follows. The composition effect that 
we are attempting to estimate is the difference between the average teacher quality under the 
treatment and the average teacher quality that would have been realized in the absence of 
treatment. We recognize that the average teacher quality in both cases depends on the quality of 
teachers who stay and the quality of teachers who replace those who leave. 

The true experimental impact is the difference between (a) the average outcome/characteristics 
of treatment stayers plus teachers who replace treatment leavers and (b) the average outcome of 
control stayers plus teachers who replace control leavers. Replacement teachers presumably come 
from a common pool of candidates who replace the leavers. We assume that the treatment has no 
impact on the quality of replacement teachers or the school principal's hiring process; in other 
words, that replacement teachers in treatment and control schools are equally well qualified. This 
way of formulating the experimental impact can be expressed in Equation (D.1): 


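$$\Delta = \left[\lambda_t \bar{Y}_t + (1 - \lambda_t)\,\bar{Y}_r\right] - \left[\lambda_c \bar{Y}_c + (1 - \lambda_c)\,\bar{Y}_r\right] \qquad \text{(D.1)}$$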
Figure D.2. Impacts on Teacher Retention in the District After Three Years by District: One-Year Districts 

[Figure not reproduced; horizontal axis: District.]

Source: Mathematica Teacher Background Survey administered in fall 2005 and Third Teacher Mobility Survey 
administered in fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and 
regression adjusted using ordinary least squares to account for differences in benchmark covariates 
and robust variance estimation to account for study design and the clustering of teachers within 
schools. Plot symbols represent the difference between regression-adjusted treatment and control 
mean within each district, and the vertical lines show the 95 percent confidence interval around each 
point. District codes A through J are arbitrary. Districts are ordered according to the size of the impact. 
N=464 teachers. 

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple comparisons.) 


In this equation, $\bar{Y}_t$, $\bar{Y}_c$, and $\bar{Y}_r$ represent the mean outcomes for treatment stayers, control 
stayers, and replacement teachers, respectively; $\Delta$ represents the impact of interest; and $\lambda_t$ and $\lambda_c$ 
represent the retention rates for the treatment and control groups, respectively. 
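
As a hedged numerical illustration of Equation (D.1), the sketch below computes the composition impact as the difference between two retention-weighted averages. The function name, argument names, and all input values are hypothetical and chosen only to show the arithmetic; they are not estimates from the study.

```python
def composition_impact(y_t, y_c, y_r, lam_t, lam_c):
    """Difference in average workforce quality, treatment minus control.

    y_t, y_c     -- mean outcome of treatment stayers and of control stayers
    y_r          -- mean outcome of replacement teachers (assumed common to both groups)
    lam_t, lam_c -- retention rates (0 to 1) in the treatment and control groups
    """
    treatment_avg = lam_t * y_t + (1.0 - lam_t) * y_r
    control_avg = lam_c * y_c + (1.0 - lam_c) * y_r
    return treatment_avg - control_avg


# Hypothetical values on a 1-to-5 observation scale.
y_t, y_c = 2.6, 2.8
lam_t, lam_c = 0.65, 0.61

# General case: replacements differ from control stayers.
print(composition_impact(y_t, y_c, y_r=2.5, lam_t=lam_t, lam_c=lam_c))

# Special case: replacements resemble control stayers (y_r == y_c),
# so the control-group average collapses to y_c.
print(composition_impact(y_t, y_c, y_r=y_c, lam_t=lam_t, lam_c=lam_c))
```

In the special case, the result depends only on the treatment retention rate and the gap between the two stayer means, so no data on replacement teachers are needed.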

We do not have data on replacement teachers, but we do not need to measure their outcomes 
explicitly. If we assume further that replacement teachers are similar to control teachers, because 
they all came from the same stream of teaching candidates, in other words that $\bar{Y}_c = \bar{Y}_r$, then the 
impact reduces to the difference between the treatment stayers' mean and the control stayers' mean (see 
Equation D.2), which is what we reported in Chapter VI. 
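
The algebra behind this reduction can be sketched as follows. The symbols are an assumed notation used for illustration: $\bar{Y}_t$, $\bar{Y}_c$, and $\bar{Y}_r$ are the mean outcomes of treatment stayers, control stayers, and replacement teachers; $\lambda_t$ and $\lambda_c$ are the treatment and control retention rates; and $\Delta$ is the impact. Substituting $\bar{Y}_r = \bar{Y}_c$ into the difference of retention-weighted averages gives:

```latex
\begin{align*}
\Delta &= \left[\lambda_t \bar{Y}_t + (1-\lambda_t)\,\bar{Y}_r\right]
        - \left[\lambda_c \bar{Y}_c + (1-\lambda_c)\,\bar{Y}_r\right] \\
       &= \lambda_t \bar{Y}_t + (1-\lambda_t)\,\bar{Y}_c - \bar{Y}_c \\
       &= \lambda_t \left(\bar{Y}_t - \bar{Y}_c\right)
\end{align*}
```

Under this sketch the impact equals the treatment retention rate times the stayer-to-stayer difference, so it has the same sign as that difference and is zero exactly when the two stayer means are equal.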




(D.2) 


Nevertheless, we present the full set of mean characteristics for stayers, movers, and leavers in 
Tables D.13 and D.14 to complement Tables VI.1 and VI.2 and provide additional context. 

Figure D.3. Impacts on Teacher Retention in the District After Three Years by District: Two-Year Districts 

Source: Mathematica Teacher Background Survey administered in fall 2005 and Third Teacher Mobility Survey 
administered in fall 2008 to all study teachers. 

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and 
regression adjusted using ordinary least squares to account for differences in benchmark covariates 
and robust variance estimation to account for study design and the clustering of teachers within 
schools. Plot symbols represent the difference between regression-adjusted treatment and control 
mean within each district, and the vertical lines show the 95 percent confidence interval around each 
point. District codes K through O are arbitrary. Districts are ordered according to the size of the impact. 
N=375 teachers. 

None of the differences is statistically significant at the 0.05 level. 

Figure D.4. Impacts on Teacher Retention in the Profession After Three Years by District: One-Year Districts 

Source: Mathematica Teacher Background Survey administered in fall 2005 and Third Teacher Mobility Survey 
administered in fall 2008 to all study teachers. 

Note: Data pertain to teachers in one-year districts participating in the study. Data are weighted and 
regression adjusted using ordinary least squares to account for differences in benchmark covariates 
and robust variance estimation to account for study design and the clustering of teachers within 
schools. Plot symbols represent the difference between regression-adjusted treatment and control 
mean within each district, and the vertical lines show the 95 percent confidence interval around each 
point. District codes A through J are arbitrary. Districts are ordered according to the size of the impact. 
N=464 teachers. 

None of the differences is statistically significant at the 0.05 level. 

Figure D.5. Impacts on Teacher Retention in the Profession After Three Years by District: Two-Year Districts 

[Figure not reproduced; horizontal axis: District.]

Source: Mathematica Teacher Background Survey administered in fall 2005 and Third Teacher Mobility Survey 
administered in fall 2008 to all study teachers. 

Note: Data pertain to teachers in two-year districts participating in the study. Data are weighted and 
regression adjusted using ordinary least squares to account for differences in benchmark covariates 
and robust variance estimation to account for study design and the clustering of teachers within 
schools. Plot symbols represent the difference between regression-adjusted treatment and control 
mean within each district, and the vertical lines show the 95 percent confidence interval around each 
point. District codes K through O are arbitrary. Districts are ordered according to the size of the impact. 
N=375 teachers. 

*Significantly different from zero at the 0.05 level. (No adjustment is applied for multiple comparisons.) 

Table D.13. Characteristics of District Stayers, Movers, and Leavers After Three Years, by Treatment Status 
(Percentages Except Where Noted): One-Year Districts 

                                                    Stayers                    Movers                     Leavers
                                            Treat-           Differ-   Treat-           Differ-   Treat-           Differ-
Teacher Characteristic                        ment  Control     ence     ment  Control     ence     ment  Control     ence

Characteristic
  College entrance exam scores (SAT           1040     1013       27     1030     1033       -3     1010     1083      -73
    combined score or equivalent)
  Attended highly selective college           27.5     27.2      0.3     26.6     39.1    -12.4     44.3     31.2     13.1
  Major or minor in education                 78.7     80.9     -2.1     69.4     75.8     -6.4     83.5     60.2     23.3*
  Student teaching experience (weeks)         15.8     15.4      0.4     13.9     14.3     -0.4     18.3     11.6      6.7*
  Highest degree is master's or doctorate     22.4     28.2     -5.8     19.2     26.7     -7.5     29.5     31.8     -2.3
  Entered the profession through              67.6     58.9      8.7     66.6     61.0      5.6     48.3     34.6     13.6
    traditional four-year program
  Certified (regular or probationary)         94.7     94.7      0.0     94.8     96.3     -1.5     90.6     96.8     -6.2
  Sample Size (Teachers)                       148      [?]                39       31                29       31
  Sample Size (Schools)                         88      [?]                30       28                22       28

Year 1 Classroom Observation Score (on 1 to 5 scale)
  Content of a literacy lesson                 2.3      2.6     -0.3*     2.1      2.1      0.1      2.8      2.1      0.6*
  Implementation of a literacy lesson          2.6      2.8     -0.2      2.6      2.4      0.1      2.9      2.6      0.3
  Classroom culture                            3.0      3.1     -0.1      2.9      2.8      0.1      3.5      3.0      0.5
  Sample Size (Teachers)                       100       94                26       23                17       18
  Sample Size (Schools)                         71       65                20       20                14       18

Source: Mathematica calculations using data from the College Board and ACT, Inc.; Mathematica Third Teacher Mobility 
Survey administered in fall 2008 to all study teachers; Mathematica classroom observations conducted in spring 
2006. 

Note: Data are weighted to account for the study design. Sample sizes vary due to item nonresponse. The analysis of 
college entrance exam scores relied on a smaller sample of teachers (84/27/14 treatment stayers/movers/leavers 
and 88/23/18 control stayers/movers/leavers) and schools (61/22/12 treatment and 62/21/17 control). Stayer: 
retained in the same school district. Mover: retained in the teaching profession, but not in the same school district. 
Leaver: no longer teaching. 

None of the differences between treatment and control stayers, between treatment and control movers, or between treatment and 
control leavers is statistically significant at the 0.06 level. p-values are suppressed to make the table easier to read. 

Table D.14. Characteristics of District Stayers, Movers, and Leavers After Three Years, by Treatment Status 
(Percentages Except Where Noted): Two-Year Districts 

                                                    Stayers                    Movers                     Leavers
                                            Treat-           Differ-   Treat-           Differ-   Treat-           Differ-
Teacher Characteristic                        ment  Control     ence     ment  Control     ence     ment  Control     ence

Characteristic
  College entrance exam scores (SAT            905      935      -30     1022     1027       -5     1058     1124      -76
    combined score or equivalent)
  Attended highly selective college           23.7     21.4      2.3     37.5     38.5     -1.0     42.3     49.8     -7.5
  Major or minor in education                 67.8     66.6      1.2     64.3     74.8    -10.5     60.3     67.6     -7.2
  Student teaching experience (weeks)         12.3     12.3      0.1     11.4     14.1     -2.7      9.9     13.2     -3.3
  Highest degree is master's or doctorate     16.7     10.2      6.5     10.2     27.2    -17.0*    25.4      0.0     25.4*
  Entered the profession through              61.1     66.4     -5.4     61.9     68.0     -6.2     35.4     65.0    -29.7*
    traditional four-year program
  Certified (regular or probationary)         95.8     92.9      2.9     91.4     94.2     -2.6     81.5     84.3     -2.8
  Sample Size (Teachers)                       124       93                35       36                31       26
  Sample Size (Schools)                         67       52                28       27                22       21

Year 1 Classroom Observation Score (on 1 to 5 scale)
  Content of a literacy lesson                 2.4      2.4      0.0      2.2      2.4     -0.1      2.3      2.3      0.0
  Implementation of a literacy lesson          2.7      2.6      0.1      2.6      2.7     -0.2      2.7      2.3      0.3
  Classroom culture                            3.1      3.1      0.1      2.7      3.0     -0.3      2.9      2.8      0.2
  Sample Size (Teachers)                        87       62                28       27                24       16
  Sample Size (Schools)                         50       41                22       21                18       12

Source: Mathematica calculations using data from the College Board and ACT, Inc.; Mathematica Third Teacher Mobility 
Survey administered in fall 2008 to all study teachers; Mathematica classroom observations conducted in spring 
2006. 

Note: Data are weighted to account for the study design. Sample sizes vary due to item nonresponse. The analysis of 
college entrance exam scores relied on a smaller sample of teachers (56/20/21 treatment stayers/movers/leavers 
and 47/17/16 control stayers/movers/leavers) and schools (40/17/17 treatment and 35/13/14 control). Stayer: 
retained in the same school district. Mover: retained in the teaching profession, but not in the same school district. 
Leaver: no longer teaching. 

None of the differences between treatment and control stayers, between treatment and control movers, or between treatment and 
control leavers is statistically significant at the 0.06 level. p-values are suppressed to make the table easier to read. 